Problem with mirroring large repositories

  • bug description (occurred issue):
    We use an instance of SCM-Manager with the “Repository Mirror” plugin. Mirroring large repositories is nearly impossible.

  • expected result / system behavior:
    When adding a repository as a mirror, the initial clone of the remote repo may take minutes, or perhaps an hour.
    Updates to that repository (triggered by the configured “Refresh Period”) should be fast, taking only seconds or minutes.

  • observed result / system behavior:
    The initial mirror can take several days while using 100% of one CPU core. Sometimes it doesn’t finish even after days, and we aren’t sure whether it will finish at all.
    If the initial mirror succeeds, some periodic updates also take hours or days to complete. This leads to a permanent utilization of 1-3 CPU cores on our production server, which hosts SCM-Manager.
    This is reproducible with a fresh install of SCM-Manager. I’ve installed a new instance and mirrored the repository of the Nextcloud Android app as an example (GitHub - nextcloud/android: 📱 Nextcloud Android app).
    The screenshots are from this test server, because I’m not sure how much I can share from our company SCM instance.
    I added the mirror on 01.01.2024 at around 11 PM. At the time of writing, it is still running and showing CPU usage, but there are no results in SCM-Manager itself.

  • SCM-Manager version and installed package:
    SCM-Server 2.48.3
    Plugin Repository Size 1.2.0
    Plugin Readme 2.1.0
    Plugin Statistic 2.2.2
    Plugin Repository Mirror 2.3.1

(Please also add Screenshots of the issue + if possible a trace created by the SCM-Manager support plugin)
The mirrored Repo:

Monitoring showing server vitals over several hours:
(The VM was freshly created, then monitoring was set up, and then the mirror was added.)

As a comparison, “git clone --mirror” of the same repo on the same server:

Hi,

thanks for this report. Apparently this massive drop in performance (which we still have to confirm) was caused by a fix regarding LFS files. It looks like we have to find another way to check for LFS files, or maybe we will add a switch to ignore LFS. Either way, the current performance is not acceptable.

We will get back to you once we have found a way.

Stay tuned,
René

Hi René,
thank you for verifying this problem.
We are currently not sure how to proceed, because we need the mirrors to work and to be up to date.
Can you tell us what priority this problem has for you?

Hi @l.bauer, we are currently working on it and looking for a solution.

Hi @l.bauer ,

we are still working on a performance optimization, but this may take a little longer. As a workaround, we have released a new version of the mirror plugin with an option to ignore LFS files.

Hope this works for you!
René

Hi @pfeuffer ,

unfortunately we don’t see a change in behavior, neither in my test setup (cloning the Nextcloud Android project) nor in our production setup.
In both cases I updated the SCM-Server via apt and the plugins via the built-in functionality of SCM-Server.
I deleted the old repos, restarted SCM-Manager, and created new repos (with the new option checked). There is still high CPU usage and no change in the repo.

Screenshots of the test system (same system as in the previous screenshots):


Hi @l.bauer ,

I’m afraid you’re right. We’ve messed up merging global and local configurations. We have to apologize for this. We will fix this (hopefully better this time) and get back to you.

Regards
René

Thank you, the latest update of the mirror plugin fixed the issue for us (as we don’t use LFS).

For anyone with many repos that need the new config option:

cd /var/lib/scm/repositories/
find . -maxdepth 4 -type f -iname 'mirror.xml' > repos.txt # find all repos with a mirror config file
# insert the new config parameter before the last line (the closing tag) of every mirror config file (GNU sed)
while IFS= read -r line; do sed -i '$i\  <ignoreLfs>true</ignoreLfs>' "$line"; done < repos.txt
systemctl restart scm-server.service # make sure the changed configs are used by SCM-Manager
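
A more defensive variant of the commands above could look like the following sketch. It is not taken from the plugin documentation: the function names `add_ignore_lfs` and `patch_all_mirrors` are made up for illustration, and it assumes GNU sed and that the closing root tag sits on the last line of each mirror.xml (the same assumption the commands above make). The main difference is that files which already contain an `<ignoreLfs>` element are skipped, so running it twice does not add the element twice.

```shell
#!/usr/bin/env bash
# Sketch only: add <ignoreLfs>true</ignoreLfs> to mirror config files
# that do not contain the element yet.
# Assumptions: GNU sed, and the closing root tag is the last line of the file.

add_ignore_lfs() {
  local file=$1
  # Skip files that already contain the element (whatever its value),
  # so repeated runs are harmless.
  if grep -q '<ignoreLfs>' "$file"; then
    return 0
  fi
  # Insert the new element before the last line of the file.
  sed -i '$i\  <ignoreLfs>true</ignoreLfs>' "$file"
}

patch_all_mirrors() {
  local root=$1
  # NUL-separated file names keep paths with spaces intact.
  find "$root" -maxdepth 4 -type f -iname 'mirror.xml' -print0 |
    while IFS= read -r -d '' file; do
      add_ignore_lfs "$file"
    done
}
```

With these functions loaded, `patch_all_mirrors /var/lib/scm/repositories` followed by `systemctl restart scm-server.service` would correspond to the steps above. Files that already carry the element are left untouched, so an existing `false` value is not changed.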

That’s great, thanks for the feedback!
