Introduction to the incremental analysis mechanisms
Incremental analysis may be used to shorten the main branch analysis, the branch analysis, and the pull request analysis.
Different mechanisms may be used:
- Unchanged files are skipped from the analysis for files that can be processed independently by the analyzer.
- The analysis cache mechanism allows reusing previous analysis results.
This section explains both mechanisms.
Skip unchanged files mechanism
This mechanism is used for pull request analyses. With this mechanism, the analysis of (particular) unchanged files (compared to the target branch) is either skipped or optimized:
- For languages like CSS, HTML, XML, Apex, Go, Ruby, and Scala, where all files can be analyzed independently: the scanner only supplies modified files to the analyzer. This means that only the changed files are analyzed.
- For languages like Kotlin, Java, and JavaScript:
- The analyzer either skips particular unchanged files or optimizes the analysis of these files. For more information, see the respective language section in this documentation.
Analysis cache mechanism
With the analysis cache mechanism, the metadata of a branch analysis is cached on the server side at the end of the analysis. This way, this data is available to the analyzers for future analysis.
The analysis cache mechanism is supported for the following languages:
- To shorten a branch analysis: C, C++, Objective-C, and COBOL.
- To shorten a pull request analysis: C, C++, Objective-C, Java, JavaScript, TypeScript, Kotlin, PHP, and Python.
Caching process
For each branch, the server manages a single analysis cache that corresponds to the latest analysis of the branch.
The caching process is as follows:
- Before an analysis, the SonarScanner downloads from the server the corresponding cache:
- For a branch analysis: the cache of the branch being analyzed.
- For a pull request analysis: the cache of the target branch.
- Or, as a fallback, the cache of the main branch.
- During the analysis, the analyzers can access the cache locally to read and/or write to the cache.
- At the end of the analysis:
- For a branch analysis: the SonarScanner uploads the new cache of the branch to the server (overwriting the existing one).
- For a pull request analysis: the SonarScanner doesn’t upload the cache of the pull request branch (the cache is not persisted).
Note that:
- If the SonarScanner for .NET is used then the scanner version 5.12 or higher is required.
- Branches that are not scanned for more than seven consecutive days are considered inactive, and the server automatically deletes their cached data to free space in the database.
- With the C/C++/Objective-C analyzer, you can also configure the change of the cache storage to the local filesystem. However, this configuration should be used only in very specific use cases.
Analysis optimization
The way the analyzer optimizes the analysis based on the cached data depends on the language. For most analyzers, the optimization will be similar to the optimization done by the C/C++/Objective-C analyzer described below. The optimization done by the Kotlin analyzer is different.
C/C++/Objective-C
During a branch analysis, the C/C++/Objective-C analyzer analyzes only the code sections that are affected by the changes in the branch compared to the previous branch analysis.
During a pull request analysis, the analyzer analyzes only the code sections that are affected by the changes compared to the target branch.
To decide whether a code section is affected by the changes, the analyzer queries the loaded cache for information. It checks if the cached analysis results can be reused (cache hit). To do so, it checks various conditions such as cross-file dependencies, quality profile setting changes, build setting changes, etc. :
- If there is a cache hit, the analyzer leverages the previously stored analysis results, and thus, saves time.
- Otherwise, the analyzer performs a new analysis of the concerned code.
Kotlin
During a branch analysis, the Kotlin analyzer stores the copy-paste duplication (CPD) tokens to provide accurate duplication information on pull requests.
During a pull request analysis, the analyzer re-uses the CPD tokens cached during the last target branch analysis for files that have not changed compared to the target branch.
Related pages
Was this page helpful?