Models
There are three code review models: pre-commit, post-commit, and the Git Fusion model. Which model you use for code reviews with Swarm is up to you.
Pre-commit model
The pre-commit model is possible due to the Helix Versioning Engine's shelving feature. Shelving enables you to temporarily make copies of your files available to other users without committing the changes into the depot. Shelving can be a very handy way for developers to create a backup, or to protect work in progress that changes to a local workspace might otherwise destroy, without having to commit code that might destabilize a codebase.
Swarm uses the shelving feature in the Helix Versioning Engine to manage code reviews. Shelving allows reviewers to easily acquire a copy of the code to be reviewed, and allows updates to the reviewed code prior to submission.
Tip
For more information on shelving, see: P4 User's Guide: Shelving work in progress.
Post-commit model
The post-commit model can be used if your team's development processes preclude the use of shelving. Code must be committed to the Helix Versioning Engine before code review can begin, which reduces the opportunity to fix problems before, for example, a continuous integration system notices problems. However, code reviews can be started for any existing code regardless of how long it has been committed.
Git Fusion model
Perforce Git Fusion provides repo management for Git repositories, and provides workflows that enable Git and Perforce users to collaborate on the same projects using their preferred tools.
The Git Fusion model is similar to the pre-commit model: changes in your local repo can be pushed for review to a named Perforce branch in the Git Fusion repo configuration. This makes your proposed changes available so that others can review and comment on them before they are committed to the target branch. Git Fusion and Swarm work together to create a review branch and container for the pre-commit collaboration.
The Git Fusion model has several limitations that you should be aware of:
- The target branch for Git Fusion-created reviews must be a fully populated branch, and must be listed in the repo-specific Git Fusion configuration.
  See "Setting up Repos" in the Git Fusion Guide for details on converting a lightweight branch into a fully populated Perforce branch.
- Reviews created with Git Fusion can only be updated from Git Fusion.
- You cannot clean up history and then push your changes to the same review. If you perform a Git rebase, you should push your changes as a new review.
- A Git Fusion review does not currently display the individual task branch commits that make up the review. Only the merged commit diffs are shown.
Tip
For more information on Git Fusion, see the Git Fusion Guide.
Internal representation
Swarm-managed changelists
A code review consists of one or more shelved changelists that Swarm manages. A shelved changelist is a pending changelist that has a snapshot of its files on a shelf associated with the changelist.
When a review is started, Swarm creates a new changelist that becomes the review changelist. What happens afterwards varies:
- If the review contains uncommitted work (the pre-commit model), Swarm copies the shelved files from the user's changelist that initiated the review into the review's changelist.
- Any time that a user's changelist associated with the review has its shelved files updated, Swarm copies the shelved files into its review changelist and creates an archive changelist. An archive changelist is no different from any other pending changelist with shelved files, but it allows Swarm to provide versioning and diffs within a review.
- If the head version of a review is committed (the post-commit model), the review's changelist is emptied of files.
The review's changelist is never actually committed; this allows the review to be opened later with additional shelved changes.
Important
Swarm's managed review changelists should only be deleted if you are uninstalling Swarm.
Swarm's review changelists maintain the history of a review and all of its feedback. The deletion of a Swarm shelved changelist causes instability and potentially data loss, and represents a scenario that can be very challenging to recover from, even with the engagement of Perforce consultants.
You can display a list of all of the Swarm-managed changelists using the p4 changelists command:
$ p4 changelists -u swarm
Change 1212285 on 2015/07/31 by swarm@swarm-96017af4-5615-9819-7af1-6fc1fa537214 *pending* 'Add requirements and instructions'
Change 1212284 on 2015/07/31 by swarm@swarm-96017af4-5615-9819-7af1-6fc1fa537214 *pending* 'Add requirements and instructions'
...
swarm is the userid with admin-level privileges within the Helix Versioning Engine that Swarm is configured to use. Use the appropriate userid when you run the p4 changelists command.
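If you want to process this list in a script, the output can be parsed line by line. The following is a minimal sketch, assuming the line layout shown in the example output above; the function name parse_changelists and the field names are hypothetical and are not part of Swarm or the p4 client. Other options or server versions may produce a different layout.

```python
import re

# Pattern derived from the example output above (assumption: one change per
# line, with an optional *status* marker before the quoted description).
CHANGE_LINE = re.compile(
    r"^Change (?P<change>\d+) on (?P<date>\S+) "
    r"by (?P<user>[^@\s]+)@(?P<client>\S+)"
    r"(?: \*(?P<status>\w+)\*)? '(?P<description>.*)'$"
)


def parse_changelists(output):
    """Turn 'p4 changelists' text output into a list of dicts."""
    records = []
    for line in output.splitlines():
        match = CHANGE_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything that is not a change line
        record = match.groupdict()
        record["change"] = int(record["change"])
        records.append(record)
    return records


# The two lines from the example output above.
sample = (
    "Change 1212285 on 2015/07/31 by "
    "swarm@swarm-96017af4-5615-9819-7af1-6fc1fa537214 *pending* "
    "'Add requirements and instructions'\n"
    "Change 1212284 on 2015/07/31 by "
    "swarm@swarm-96017af4-5615-9819-7af1-6fc1fa537214 *pending* "
    "'Add requirements and instructions'\n"
)

for record in parse_changelists(sample):
    print(record["change"], record["status"], record["client"])
```

In practice the text would come from running the command with the userid that Swarm is configured to use, rather than from a hard-coded sample.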
Swarm-managed workspaces
Whenever Swarm creates a changelist for a review, it uses a client workspace (or just workspace) associated with the configured Helix userid that has admin privileges. Whenever a user commits a change via Swarm's user interface, Swarm uses a workspace associated with that user.
Tip
To learn more about workspaces, see the section Helix as a version control implementation in the Introducing Helix guide.
The workspaces that Swarm creates and uses live in the SWARM_ROOT/data/clients folder.
Inside the clients folder, Swarm maintains a user-specific folder that contains any workspace folders that may be required. Each user-specific folder is named by converting the user's Helix userid into hexadecimal to avoid any characters that would be problematic in the filesystem, such as slashes, accents, UTF-8 characters, etc. For example, the folder for the user eedwards would be named 6565647761726473.
Within the user-specific folder are the folders that become the root of each workspace. Each of these folders is named with a globally-unique identifier (GUID) prefixed with swarm-, for example swarm-438d482b-f107-9a35-c06c-86ac68136b00. Accompanying each folder is a lock file with the same name plus the .lock extension. Finally, the user-specific clients folder contains a management lock file called manage.lock.
Here is an example of the folder structure:
SWARM_ROOT/
    data/
        clients/
            6565647761726473/
                manage.lock
                swarm-438d482b-f107-9a35-c06c-86ac68136b00/
                swarm-438d482b-f107-9a35-c06c-86ac68136b00.lock
                swarm-8388362a-233d-0cb9-3e90-895eaaa99f6c/
                swarm-8388362a-233d-0cb9-3e90-895eaaa99f6c.lock
            736c6f7264/
                manage.lock
                swarm-da7de4b4-0ecb-12c8-1b35-f3e32bb18033/
                swarm-da7de4b4-0ecb-12c8-1b35-f3e32bb18033.lock
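The folder naming described above can be reproduced with a few lines of Python. This is a sketch only: it assumes the userid is encoded as UTF-8 before the hexadecimal conversion (which matches the eedwards example), and the helper names and the /opt/swarm path are hypothetical.

```python
import os


def user_folder_name(userid):
    # Assumption: the userid is UTF-8 encoded, then each byte is written
    # as two lowercase hexadecimal digits.
    return userid.encode("utf-8").hex()


def userid_from_folder(folder_name):
    # Reverse mapping, handy when inspecting the clients folder by hand.
    return bytes.fromhex(folder_name).decode("utf-8")


def user_clients_path(swarm_root, userid):
    # SWARM_ROOT/data/clients/<hex userid>
    return os.path.join(swarm_root, "data", "clients", user_folder_name(userid))


print(user_folder_name("eedwards"))        # 6565647761726473
print(userid_from_folder("736c6f7264"))    # slord
print(user_clients_path("/opt/swarm", "eedwards"))
```

Decoding a folder name in this way is a quick method for working out which user a folder under data/clients belongs to.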
Here are the steps Swarm takes when it needs to use a client:
1. Convert the current connection's userid to hexadecimal.
2. Check to see whether a user-specific folder exists within SWARM_ROOT/data/clients; if not, create the folder.
3. Within the user-specific folder, loop over any existing workspace folders and attempt to lock each in turn:
   - If a lock is acquired, skip to the next step.
   - Otherwise, perform the create workspace procedure.

   Create workspace procedure
   1. Check if the max number of clients for the current user has been reached:
      - If so, wait a short amount of time (50 milliseconds), and start step 3 again.
      - If not, proceed to the next step.
   2. Take a lock on manage.lock.
   3. Check if the max number of clients for the current user has been reached:
      - If so, release the manage.lock, wait a short amount of time (50 milliseconds), and start step 3 again.
      - If not, proceed to the next step.
   4. Create a new workspace folder using a GUID-based filename, and take a lock on the folder.
   5. Release the manage.lock lock.
4. Perform the necessary file operations using the locked workspace folder.
5. Revert the file content within the workspace folder to avoid having constantly growing disk space use.
   Note
   There may occasionally be stray files left; Swarm is not aggressive about cleaning up.
6. Swarm releases the lock on the workspace folder.
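The locking sequence above can be sketched as follows. This is an illustration, not Swarm's actual implementation: it assumes POSIX advisory locks via Python's fcntl.flock, a UTF-8 hexadecimal encoding of the userid, and hypothetical names (acquire_workspace, release_workspace, max_clients, attempts). The file operations and the revert are left to the caller.

```python
import fcntl
import os
import time
import uuid


def _try_lock(lock_path):
    """Return an fd holding an exclusive lock, or None if the lock is taken."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def _workspaces(user_dir):
    """List the existing workspace folders (swarm-<GUID>) for one user."""
    return sorted(
        name for name in os.listdir(user_dir)
        if name.startswith("swarm-")
        and os.path.isdir(os.path.join(user_dir, name))
    )


def acquire_workspace(clients_dir, userid, max_clients=2, attempts=100):
    """Return (workspace_path, lock_fd) for a locked workspace folder."""
    # Steps 1 and 2: hex-encode the userid and make sure its folder exists.
    user_dir = os.path.join(clients_dir, userid.encode("utf-8").hex())
    os.makedirs(user_dir, exist_ok=True)
    manage_path = os.path.join(user_dir, "manage.lock")

    for _ in range(attempts):
        # Step 3: try to lock each existing workspace folder in turn.
        for name in _workspaces(user_dir):
            fd = _try_lock(os.path.join(user_dir, name + ".lock"))
            if fd is not None:
                return os.path.join(user_dir, name), fd

        # Create workspace procedure: check the limit, take manage.lock,
        # then check the limit again before creating a new folder.
        if len(_workspaces(user_dir)) < max_clients:
            manage_fd = os.open(manage_path, os.O_CREAT | os.O_RDWR)
            fcntl.flock(manage_fd, fcntl.LOCK_EX)  # waits for manage.lock
            try:
                if len(_workspaces(user_dir)) < max_clients:
                    name = "swarm-" + str(uuid.uuid4())
                    os.mkdir(os.path.join(user_dir, name))
                    fd = _try_lock(os.path.join(user_dir, name + ".lock"))
                    if fd is not None:
                        return os.path.join(user_dir, name), fd
            finally:
                # Always release manage.lock, whether or not we created one.
                fcntl.flock(manage_fd, fcntl.LOCK_UN)
                os.close(manage_fd)

        time.sleep(0.05)  # 50 milliseconds, then start step 3 again

    raise TimeoutError("no workspace became available")


def release_workspace(lock_fd):
    """Step 6: release the lock on the workspace folder."""
    fcntl.flock(lock_fd, fcntl.LOCK_UN)
    os.close(lock_fd)
```

The limit is checked twice, once before and once after taking manage.lock, so that two processes that both see room for one more workspace cannot both create one. The attempts cap is an addition for the sketch; the steps above do not state how long Swarm keeps retrying.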
Most users should only require 1-2 workspaces, and those are only required if they commit from Swarm. The admin user that Swarm is configured to use should only use one workspace per configured worker.
By default, the number of workspaces that could be active at any given instant is two times the number of configured workers. Since the default worker count is three, Swarm would use at most six workspaces simultaneously.
If the workspace limit is reached, further file processing is blocked until a workspace becomes available. Potentially, this means that users could encounter timeouts. Configuring Swarm to use more workers could solve that issue.
Removal considerations
Administrators might wish to remove Swarm-managed workspaces. There are a few considerations that should be assessed prior to removal:
- Ideally, you should stop the web server (taking Swarm out of service) before removing a Swarm-managed workspace from the Swarm server; this eliminates the risk of removing a workspace that is in use.
  If you do not stop the web server first, Swarm may encounter an error during a submit.
- Removal of a Swarm-managed workspace folder does not remove the client spec from the Helix Versioning Engine. Unless the client spec is removed, that workspace effectively becomes orphaned. Orphaned clients are, of themselves, not a big concern as the storage and performance impact is negligible.
- Removal of a Swarm-managed workspace's corresponding client spec in the Helix Versioning Engine can be done. However, you should never remove a client spec that has associated shelved files.
  Usually, the only client specs that should have associated shelved files belong to the admin account that Swarm is configured to use. All other workspaces that may exist for other users are primarily used for submitting changes, and so should not have shelved files associated.
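Before removing a client spec, you may want to confirm that it has no shelved changelists. The sketch below assumes that p4 changes -s shelved -c CLIENT lists the shelved changelists for a client workspace and that the p4 command is on the PATH and already logged in; the function names are hypothetical, and the run_p4 parameter exists only so that the check can be exercised without a server.

```python
import subprocess


def _run_p4(args):
    # Runs the real p4 client; raises CalledProcessError on failure.
    result = subprocess.run(
        ["p4", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def shelved_changes(client, run_p4=_run_p4):
    """Return the numbers of changelists with shelved files for a client."""
    output = run_p4(["changes", "-s", "shelved", "-c", client])
    changes = []
    for line in output.splitlines():
        parts = line.split()
        # Expected line shape: "Change <number> on <date> by <user>@<client> ..."
        if len(parts) >= 2 and parts[0] == "Change" and parts[1].isdigit():
            changes.append(int(parts[1]))
    return changes


def safe_to_remove_client(client, run_p4=_run_p4):
    """True only when the client has no shelved changelists."""
    return not shelved_changes(client, run_p4)
```

For example, safe_to_remove_client("swarm-da7de4b4-0ecb-12c8-1b35-f3e32bb18033") would return False if that workspace still holds shelved files, in which case the client spec should be left in place.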