Creating a Standalone SAROS Server

worked on by: Denis Washington

deadline: January 4, 2016

Outline

Based on the work of Nils Bussas and the ongoing work to create a core library for Saros implementations, this thesis aims to create a dedicated Saros session server which is able to host a session on behalf of the session's participants, rather than giving one of the participating users this role. Such a server has the advantage that the session is not dependent on the presence of any particular user in the session - every user will be able to freely join and leave while the session continues on the server, possibly even for days and weeks.

This effort consists of three major tasks:

  • The completion of the Saros protocol enhancements needed to make server-based sessions possible - more specifically, a way for users to initiate a session on the server, and the possibility for non-host participants to add projects to a session. Nils Bussas' prototype server built into Saros Eclipse serves as the basis for this work.
  • The creation of the actual standalone server. This includes moving some necessary components to the Saros core that are currently duplicated between Saros Eclipse and Saros IntelliJ (such as SarosSession and SarosSessionManager). The resulting implementation is not expected to be production-ready, but good enough to serve as a basis for further work.
  • An evaluation of the newly created server, especially concerning its fitness for hosting long-running sessions. This will consist of a mix of black-box (e.g. stress testing) and white-box (code review and analysis) approaches and focus on threats to long-term reliability such as memory and other resource leaks.

Thesis Requirements

  • A standalone Saros server program which allows users to create or join a single session and share projects. At a minimum, changes to files should be synchronized between participants as in a classic host-based session. Some advanced Saros features might not work yet.
  • ?

Milestones and Planning

TODO

| Milestone no. | Past days | CW | Goals | Target accomplished |
| 2 | | 35 | SarosSessionManager in core | no (last refactorings missing) |
| 1 | DONE | | Project sharing by non-hosts | yes (patch #2264) |

Weekly Status

Week 8 (CW 35)

Activities

Worked towards moving SarosSession and SarosSessionManager to the core.

Results

Managed to create a patch series that moves SarosSession to the core, with more changes still needed than originally thought. Some of these patches still need work or need to be replaced with different approaches before everything can be merged, but some patches already made their way into master.

The goal of moving SarosSessionManager to the core has not been fully met; this work has to be continued and completed next week.

Pushed Patches:

Approved Patches:

Merged Patches:

Patches Needing Work:

Problems

The future of SarosSessionTest needs to be decided (see #2852). It is more or less an Eclipse-specific integration test for session startup and thus cannot easily be moved to the core together with SarosSession. A possible approach might be to simply remove this test class and write a much simpler SarosSessionTest for the core.

Next Steps

  • Resolve problems with the pushed patches
  • Finish the move of SarosSessionManager to the core
  • Create a project skeleton for the Saros server
  • If time permits, start to work on an implementation of the filesystem interfaces that directly uses the Java standard library, starting with IPath

Week 9 (CW 36)

Activities

  • Fixed issues in and merged some of the patches pushed last week
  • Uploaded the last patches needed to move SarosSessionManager to the core
  • Started working on a core implementation of the IPath interface

Results

The work for the move of SarosSessionManager is now essentially done (except for possible work needed on the patches in review). There is the start of an IPath implementation (#2867), but several interfaces remain to be implemented before there is a complete core implementation of the Saros filesystem interfaces.

Pushed Patches:

Updated Patches:

Abandoned Patches:

Merged Patches:

Approved Patches:

Problems

None this week.

Next Steps

  • Wait for review comments for the not-merged-yet patches and fix possible issues
  • Finish the IPath implementation and continue implementing further filesystem interfaces

Week 10 (CW 37)

Activities

  • Merged more of the patches for moving SarosSessionManager to the core
  • Created an Eclipse project for the Saros server and started writing an Ant build file for it
  • Wrote a new IPath implementation based on java.nio.file.Path
  • Started to prepare the kickoff presentation for my master thesis

Results

SarosSessionManager's move to the core is now merely waiting for an approval of #2852. Stefan gave a +1 to the patch, but due to Franz' early objections to the patch, I am holding off the merge for now until this has been discussed.

The server now has a (so far empty) Eclipse project. I have set the project to require Java 7 as this allows me to make use of the java.nio.file API, which will greatly simplify the implementation of the Saros filesystem interfaces. I already reaped the benefits of this decision by replacing the IPath implementation I started last week with one wrapping java.nio.file.Path; the result is smaller and at the same time more complete and robust than the previous approach.
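The wrapping approach can be illustrated with a minimal sketch. The interface SimplePath below is a hypothetical, cut-down stand-in for IPath (the real interface has more methods), and NioPathSketch is an illustrative name, not the class from the actual patch. It only shows how little code is needed once java.nio.file.Path does the real work.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical, cut-down stand-in for the Saros IPath interface. */
interface SimplePath {
    int segmentCount();
    String segment(int index);
    String lastSegment();
    boolean isAbsolute();
    SimplePath append(String relativePath);
    String toPortableString();
}

/** Immutable path that delegates all real work to java.nio.file.Path. */
final class NioPathSketch implements SimplePath {
    private final Path delegate;

    private NioPathSketch(Path delegate) {
        // normalize() removes "." segments and resolves ".." where possible
        this.delegate = delegate.normalize();
    }

    static NioPathSketch of(String path) {
        return new NioPathSketch(Paths.get(path));
    }

    @Override
    public int segmentCount() {
        return delegate.getNameCount();
    }

    @Override
    public String segment(int index) {
        // throws IllegalArgumentException if the index is out of range
        return delegate.getName(index).toString();
    }

    @Override
    public String lastSegment() {
        Path name = delegate.getFileName();
        return name == null ? null : name.toString();
    }

    @Override
    public boolean isAbsolute() {
        return delegate.isAbsolute();
    }

    @Override
    public NioPathSketch append(String relativePath) {
        return new NioPathSketch(delegate.resolve(relativePath));
    }

    @Override
    public String toPortableString() {
        // always use "/" so the string is the same on every platform
        String separator = delegate.getFileSystem().getSeparator();
        return delegate.toString().replace(separator, "/");
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof NioPathSketch
            && delegate.equals(((NioPathSketch) other).delegate);
    }

    @Override
    public int hashCode() {
        return delegate.hashCode();
    }
}
```

For example, NioPathSketch.of("project/src/../src/Main.java").toPortableString() yields project/src/Main.java, because normalization, segment access and separator handling all come from the standard library.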

New Patches:

Changes to Existing Patches:

Problems

I am still unsure how best to write the server's Ant build file for CI, as well as how the CI integration actually happens. I will ask Stefan and Franz about this.

Next Steps

  • Continue implementing filesystem interfaces for the server
  • Discuss with Franz how to move on with #2852
  • Finish my kickoff presentation

Week 11 (CW 38)

Activities

  • Completed work on the kickoff presentation slides
  • Talked with Franz about #2852

Results

The slide deck for the kickoff presentation is now complete. Unfortunately, I was not able to push any code changes this week.

We decided that #2852 can be abandoned altogether.

Problems

None.

Next Steps

  • Continue implementing filesystem interfaces for the server
  • Create Jenkins jobs for the server
  • Revise server build file

Week 12 (CW 39)

Activities

  • Tweaked the Ant build file for the server
  • Created Jenkins jobs for server-related Gerrit patches
  • Extended the filesystem interfaces for the IPath and IResource implementations
  • Wrote server implementation of IResource (not pushed yet)

Results

There is now a Jenkins job Saros-Gerrit-QA-Server which is run as part of the CI build when a new patch set is pushed to Gerrit. This ensures that server patches are properly tested before they are reviewed and merged. However, the required build file is not merged yet (#2880), so other server patches need to be rebased on top of the build patch for the time being.

The patch for implementing IPath (#2877) was slightly revised. I also implemented most of the methods of IResource locally (excluding those that need to be implemented per-resource-type), but didn't create a patch for the code yet because it requires two additions to the filesystem interfaces - IResource#getLocation and IPath#segment - so I pushed those first. ServerResourceImpl will be pushed as soon as those two patches have been merged. (I am not able to push it earlier because I cannot rebase it on top of the interface commits and the build file patch.)

The patch to move SarosSessionManager to the core was not merged yet because Stefan noted that LeaveAndKickHandler needs to be moved first. This class has a dependency on the Eclipse-specific SarosView class that needs to be removed. I pushed #2902 to prepare for this.

New Patches:

Changes to Existing Patches:

Problems

None.

Next Steps

  • Get build file patch merged
  • Create a server CI job for master when the build file is merged
  • Push the IResource implementation
  • Continue implementing filesystem interfaces for the server

Week 13 (CW 40)

Activities

  • Merged build file for the server
  • Released the IResource implementation for review
  • Wrote server implementation of IFile

Results

I merged the Ant build file for the server. Now all server patches are properly tested and the results posted to Gerrit. Still missing are SonarQube reports, the configuration file for which is still in review as patch #2890. Franz found a few issues in the file which still need to be fixed.

Work on the server filesystem implementation continues. I uploaded the IResource implementation (#2904) and a draft for an implementation of IFile (#2907).

New Patches:

Problems

None.

Next Steps

  • Get SonarQube working with the server
  • Complete IFile implementation
  • Continue implementing other filesystem interfaces

Week 14 (CW 41)

Activities

  • Continued work on filesystem interfaces

Results

Most filesystem interfaces are implemented now (see new patches). Only IWorkspace is still missing.

Regarding the setup of SonarQube for the server, I found out that changes on the SonarQube admin interface are likely needed. Franz will provide me with SSH access to the Saros build server so that I can expose the admin interface and do any required changes.

New Patches:

Updates to Existing Patches:

Problems

None.

Next Steps

  • Implement IWorkspace
  • Configure SonarQube for the server
  • Start getting the server to run

Week 15 (CW 42)

Activities

  • Implemented IWorkspace
  • Started work towards a runnable, functioning server

Results

IWorkspace is now implemented (#2921). The set of filesystem interface implementations is thus complete and just needs to go through review.

I sent my public SSH key to Franz and now have access to the build server in order to make the SonarQube admin interface accessible and configure it for the Saros server. This is still a TODO.

I started work on getting the server running. The goal here is to quickly get to a state where a working server session is possible, create a big draft patch, and then revise and break out master-worthy patches piece by piece.

So far I cherry-picked some in-review patches (such as the one moving SarosSessionManager to the core), implemented a few more core interfaces and wrote a simple main class. I have reached the point where the server is able to start, successfully connect to the XMPP network and advertise itself as a Saros server; however, the session creation process still hangs for reasons that need to be investigated.

New Patches:

Problems

None.

Next Steps

  • Continue working towards a working session with the server

Week 16 (CW 43)

Activities

  • Got server session creation and adding projects to work

Results

Work on a functioning server has yielded some results. After fixing some problems mostly related to threading (such as an INegotiationHandler server implementation which blocked the message receiver thread and thus made it impossible for the negotiation code to receive messages, as well as a faulty UISynchronizer implementation which blocked on nested syncExec() calls), I was able to successfully create a server session and add projects to it from an Eclipse with patch #2602 applied.
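The nested syncExec() problem can be sketched as follows. This is not the server's actual UISynchronizer implementation; the class name SingleThreadSynchronizer and its method set are assumptions made for illustration, and lambdas (Java 8+) are used only for brevity. The point is the check in syncExec(): a task that is already running on the worker thread must not queue a second task and wait for it, because the worker would then be waiting for itself.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical synchronizer that runs all tasks on one dedicated worker thread. */
final class SingleThreadSynchronizer {
    // Set whenever the executor creates its worker thread
    private volatile Thread worker;

    private final ExecutorService executor =
        Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "synchronizer-worker");
            thread.setDaemon(true);
            worker = thread;
            return thread;
        });

    /** Queues the task and returns immediately. */
    void asyncExec(Runnable task) {
        executor.execute(task);
    }

    /** Runs the task on the worker thread and waits until it has finished. */
    void syncExec(Runnable task) {
        if (Thread.currentThread() == worker) {
            // Nested call from within a task: queueing and waiting here
            // would deadlock, so run the task directly instead.
            task.run();
            return;
        }
        try {
            executor.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

Without the thread check, calling syncExec() from inside another syncExec() task would submit the inner task to a queue that only the (currently blocked) worker can drain.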

However, when I make changes to a project after adding it, these changes are not applied on the server. It turned out that all code that consumes the activities and acts upon them is located in the Saros/E and Saros/I sources and is IDE-specific. I thus need to write server-specific AbstractActivityConsumer implementations which apply activities to the server's local workspace.

Problems

None.

Next Steps

  • Write server-specific activity consumers

Week 17 (CW 44)

Activities

  • Implemented file system synchronization in the server
  • Uploaded the result as "the huge server patch" to Gerrit

Results

The main focus of this week was completing a first working version of the standalone server. The last missing piece - handlers for file and editor activities which apply the described changes to the server's local workspace - is now implemented. This includes a partial implementation of IEditorManager which caches the text content of recently changed files to reduce disk I/O. (Batched writing of changes could avoid even more I/O; right now, every text edit by a client is immediately saved to disk.)
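The caching idea can be sketched roughly as below. The names RecentContentCache and ContentLoader are hypothetical, and the least-recently-used eviction policy is an assumption of this sketch; the text above only says that the content of recently changed files is cached. Writing the updated text back to disk is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical callback that reads a file's text from disk. */
interface ContentLoader {
    String load(String path);
}

/** Keeps the text of the most recently edited files in memory. */
final class RecentContentCache {
    private final Map<String, String> contents;

    RecentContentCache(final int maxEntries) {
        // accessOrder = true makes iteration order "least recently used first",
        // so removeEldestEntry() evicts the file that was untouched the longest
        this.contents = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Applies a text insertion; only touches the disk on a cache miss. */
    synchronized String insert(String path, int offset, String text,
            ContentLoader loader) {
        String current = contents.get(path);
        if (current == null) {
            current = loader.load(path); // cache miss: read the file once
        }
        String updated = new StringBuilder(current).insert(offset, text).toString();
        contents.put(path, updated);
        return updated;
    }

    synchronized int size() {
        return contents.size();
    }
}
```

With such a cache, a burst of edits to the same file causes a single read; combining it with batched writes would be the natural next step mentioned above.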

I took all of the code I have written for the server and uploaded it as [[http://saros-build.imp.fu-berlin.de/gerrit/2929][#2929 ("the huge server patch")]]. This patch also includes all changes of the patches it is based on, but which are not yet merged into master (such as the server-specific filesystem interface implementations). This patch is not intended for merging, but only for testing the server in its current state.

Besides this, I got annoyed by the large number of warnings shown in the Eclipse Problems view and took the time to create a few patches that fix some classes of them (e.g., "empty block need comment" warnings). Together, these patches should decrease the warning count by about 150.

New Patches:

Changes to Existing Patches:

Problems

None.

Next Steps

  • Start splitting out mergeable parts of the huge server patch
  • Finally configure SonarQube for the server

Week 18 (CW 45)

Activities

  • Finished configuring SonarQube analysis of the server
  • Extracted first patch from the huge server patch

Results

I finally succeeded in getting SonarQube analysis and reporting for new patches to work. It turned out that nothing was missing in the configuration of the SonarQube server itself; only the custom sonarqube-gerrit-bridge needed an entry for the server in its configuration file.

SonarQube now works in general, but the analysis of the huge server patch crashes for unknown reasons, probably in a checker analyzing a specific section of the code. Franz and I agreed to wait until one of the extracted small patches causes the same crash and then debug the issue there.

Speaking of extracted patches, I succeeded in extracting the UISynchronizer implementation from the huge server patch and pushing it as a separate patch which also includes unit tests. It took a while to find a nice way to split a commit into two pieces, but after a bit of Googling and practice I now feel that I can easily repeat this procedure for future patch extractions.

New Patches:

Changes to Existing Patches:

Problems

Except for the SonarQube crash issue explained in the previous section, none.

Next Steps

  • Continue work on extracting patches, getting them merged
  • Start looking more deeply at the STF framework for the purpose of writing a stress test for the server

Week 19 (CW 46)

Activities

  • Looked at STF framework, added possibility for joining a server session
  • Tweaked the huge server patch

Results

For the planned server stress test, I looked at how to write tests with the STF framework. Thankfully, the process looks very simple. What was missing was a possibility to join a server session rather than creating one as the host and inviting others. I added a method to the framework to do this, which was surprisingly simple.

Other than that, I found a serious problem in the huge server patch: the server would not route activities it received from one client to the others. After quite a bit of debugging, I found out that the SharedProjectMapper used IProject instances as hash keys and thus expected equals() and hashCode() to be implemented. Because this wasn't the case in the server implementation, the code couldn't find the list of shared files for any project and thus concluded that no file was currently shared; as a result, no activities were sent on the server side. This problem is now fixed in the newest patch set and will be split out into a smaller patch later.
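The underlying rule is that any class used as a hash key must implement equals() and hashCode() consistently; otherwise two instances that describe the same project are treated as different keys. A minimal sketch follows, with ProjectHandle as a hypothetical stand-in for the server's IProject implementation and its two fields chosen purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for a server-side IProject implementation. */
final class ProjectHandle {
    private final String workspaceRoot;
    private final String name;

    ProjectHandle(String workspaceRoot, String name) {
        this.workspaceRoot = workspaceRoot;
        this.name = name;
    }

    // Two handles are equal if they point to the same project location,
    // even if they are different object instances.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ProjectHandle)) {
            return false;
        }
        ProjectHandle that = (ProjectHandle) other;
        return workspaceRoot.equals(that.workspaceRoot) && name.equals(that.name);
    }

    // Must be computed from the same fields that equals() compares
    @Override
    public int hashCode() {
        return Objects.hash(workspaceRoot, name);
    }
}

final class ProjectHandleDemo {
    public static void main(String[] args) {
        Map<ProjectHandle, String> sharedFiles = new HashMap<>();
        sharedFiles.put(new ProjectHandle("/workspace", "demo"), "src/Main.java");

        // A fresh but equal instance finds the entry. Without the two
        // overrides above, this lookup would return null.
        System.out.println(sharedFiles.get(new ProjectHandle("/workspace", "demo")));
    }
}
```

With the default Object implementations, the lookup in main() would compare object identity and miss the entry, which matches the symptom described above: the list of shared files could not be found for any project.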

I didn't continue extracting patches out of the huge server patch because there are enough pending server patches already that need merging. I will concentrate on getting these merged before extracting new ones.

Changes to Existing Patches:

Problems

None.

Next Steps

  • Begin writing the server stress test
  • Continue work on getting existing server patches merged

Week 20 (CW 47)

Activities

  • Created patches for moving LeaveAndKickHandler to core
  • Found and fixed a server session inconsistency bug

Results

I updated the huge server patch to get rid of the hardcoded XMPP credentials and workspace directory path. Now, both of these can be passed to the server on startup via the Java system property mechanism. I also added a SarosServer launch configuration which passes usable defaults for these options. Last but not least, I added a component which listens for and applies FolderActivitys in addition to FileActivitys.
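Reading such options from Java system properties can be sketched as below. The property names (example.server.jid and so on) and the class ServerConfigSketch are invented for this illustration; the names used by the actual patch are not stated here. The sketch treats the credentials as required and falls back to the current working directory for the workspace, which is likewise an assumption.

```java
import java.util.Properties;

/** Hypothetical holder for the server's startup options. */
final class ServerConfigSketch {
    // Illustrative property names, not the ones used by the real server
    static final String JID_KEY = "example.server.jid";
    static final String PASSWORD_KEY = "example.server.password";
    static final String WORKSPACE_KEY = "example.server.workspace";

    final String jid;
    final String password;
    final String workspacePath;

    private ServerConfigSketch(String jid, String password, String workspacePath) {
        this.jid = jid;
        this.password = password;
        this.workspacePath = workspacePath;
    }

    /** Reads the options passed via -Dkey=value on the java command line. */
    static ServerConfigSketch fromSystemProperties() {
        return from(System.getProperties());
    }

    /** Separated from fromSystemProperties() so it can be tested in isolation. */
    static ServerConfigSketch from(Properties properties) {
        String jid = required(properties, JID_KEY);
        String password = required(properties, PASSWORD_KEY);
        // Optional: default to the directory the server was started in
        String workspace = properties.getProperty(WORKSPACE_KEY,
            System.getProperty("user.dir"));
        return new ServerConfigSketch(jid, password, workspace);
    }

    private static String required(Properties properties, String key) {
        String value = properties.getProperty(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException(
                "Missing required system property: -D" + key + "=...");
        }
        return value;
    }
}
```

A launch configuration would then pass the values as VM arguments, for example -Dexample.server.jid=server@example.org, instead of having them hardcoded in the source.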

In order to remove the last blocker for a merge of the SarosSessionManager-to-core patch, I created a set of three patches which move LeaveAndKickHandler to the core. While at it, I resolved the class' FIXME comment by refactoring the handler into a session component.

While starting to complete the stress test for the server, I noticed that if two clients continuously edit the same text file during a server session, their file versions begin to diverge after a while. Even more strangely, this problem only occurs on Windows. I suspected that the problem is file-system-related, so I commented out the code that saves received text edits to disk in ServerEditorManagerImpl; the problem disappeared. (If the saving is replaced with Thread.sleep(1000), the problem still doesn't appear, so it doesn't seem to be a pure timing problem.)

Further digging revealed that I had implemented the setContents() method of ServerFileImpl naively, by writing the new contents to the file directly. This is fragile because the write isn't atomic. On Windows, it has the additional disadvantage of locking the file during the write. I changed the code to write the new contents to a temporary file instead, then move that file atomically to the correct destination. (This is how saving is implemented in most text editors.) This caused the problem to disappear completely. It still bugs me that I don't know exactly what the problem was, but as this change was needed anyway, I guess it's fine.
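The write-to-temporary-file-then-move technique can be sketched with the java.nio.file API as follows. This is an illustration under assumptions, not the code of ServerFileImpl: the class name AtomicFileWriter, the temporary file naming and the non-atomic fallback are choices made for this sketch.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper that replaces a file's contents in one step. */
final class AtomicFileWriter {
    private AtomicFileWriter() {
    }

    static void write(Path target, byte[] contents) throws IOException {
        // Create the temporary file next to the target so that the move
        // stays on the same filesystem; atomic moves usually require that.
        Path directory = target.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(directory, ".tmp-", ".part");
        try {
            Files.write(temp, contents);
            try {
                // Readers see either the old or the new contents, never a
                // half-written file. Whether an existing target is replaced
                // is implementation-specific according to the Javadoc; the
                // common platforms do replace it.
                Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fallback for filesystems without atomic moves (not atomic)
                Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } finally {
            // No-op after a successful move; cleans up after a failure
            Files.deleteIfExists(temp);
        }
    }
}
```

A call such as AtomicFileWriter.write(path, text.getBytes(StandardCharsets.UTF_8)) then never leaves a partially written file at the destination, even if the process is interrupted during the write.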

New Patches:

Changes to Existing Patches:

Problems

Now that the strange server session bug has been fixed, none.

Next Steps

  • Continue with the server stress test
  • Continue work on getting existing server patches merged

Week 21 (CW 48)

Activities

  • Improved integration of server into the Saros Gerrit CI job

Results

I met with Holger to work on the Saros Gerrit CI job (now named Saros-Full-Gerrit). This job is configured to use Java 6, but the server and now Saros-IntelliJ require Java 7. The solution was to create subprojects for these two (Saros-Server-Gerrit and Saros-IntelliJ-Gerrit) and call them from Saros-Full-Gerrit. We were then able to configure the subprojects to use Java 7.

Stefan found another problem with the SarosSessionManager patch. There were three packet handlers left in Saros/I which were duplicating work of NegotiationPacketListener, a class which is being moved together with SarosSessionManager in the same patch. I updated the patch to remove the obsolete handlers.

I didn't accomplish much else this week, unfortunately, other than writing a little bit on the master thesis.

Changes to Existing Patches:

Problems

None.

Next Steps

  • Continue with the server stress test
  • Continue work on getting existing server patches merged
  • Continue writing the master thesis

Week 22 (CW 48)

Activities

  • Continued writing the master thesis
  • Wrote, ran second stress test (server session joins and leaves)

Results

I continued black-box evaluation of the server by writing a second stress test. Instead of concurrent text edits, this one continuously joins a server session and leaves it again. Like the test before it, this one didn't expose any memory leaks or other problems. I will probably continue writing one or two other tests.

The SarosSessionManager patch is finally merged. Yay! It's in the core now.

Other than that, this week was mostly about writing my thesis.

Changes to Existing Patches:

Problems

None.

Next Steps

  • Same as before

Week 22 (CW 49)

Activities

  • Fixed and cleaned up non-host project negotiation patch (#2264)
  • Continued writing thesis

Results

While writing the part of my master thesis about the non-host project negotiation I implemented with Ute Neise many months ago, I double-checked whether the code handles all the use cases it was supposed to handle (e.g. editing while the project archive is being sent). Unfortunately, I found out that the patch doesn't handle them, although I was sure it did (I have rebased the patch multiple times since - maybe I was sloppy and broke something …). With my knowledge of the Saros internals having grown considerably over the months, I spent half a day greatly improving the patch and making it much smaller as well. Now the edge cases are handled without data loss as expected, and the host doesn't even have to share projects back with the non-host it just received them from, as was the case before. Success!

Other than that, I continued writing my thesis.

Changes to Existing Patches:

Problems

None.

Next Steps

  • Same as before