GitOps Playground: no jobs shown in Jenkins

Hi,

I am trying to run the GitOps Playground on an Ubuntu Desktop (20.04.3) virtual machine by following the “Demo on local machine” scenario. SCM-Manager and ArgoCD are fine. When I open the link to the Jenkins instance I can log in, but I see a 404 error message afterwards. I am able to access the Jenkins dashboard in the browser, but no jobs are shown. On the Ubuntu VM I installed Docker as a snap package from Ubuntu’s app center and made it executable by a “normal” user.

It would be great if someone had ideas or suggestions about what could cause the problem. Thanks in advance.

Hi khedr0n,

Thanks for your question! 🙂 Any feedback and help on improving the playground is appreciated!

What exactly do you mean by “there are no jobs”? Do you see the job folders for the operators you installed? By default there is one per operator: argocd-applications, fluxv1-applications and fluxv2-applications.

If you can see those folders, have you tried opening one of them and scanning manually for build jobs using Scan Multibranch Pipeline Now? Does it give any errors?

Since Jenkins uses SCM-Manager in the pre-configured builds: does SCM-Manager contain all the necessary repositories? It should come with pre-configured repositories for the operators and for the applications themselves. Do the repositories contain the necessary application data, such as a Jenkinsfile?

If there are no folders at all, could you please provide some more information on

  • Which installation method did you use? The bash command utilising the container, or did you clone the repository and execute the scripts manually?
  • Did the installation succeed and print valid local addresses for all applications?
  • Is there any output from the installation and configuration process that you can provide?
  • Are there any errors in the logs (Jenkins, SCM-Manager, ArgoCD)?

If you have any more questions, or encounter any other problems please feel free to ask!

Kind regards

Hi,

I’ve created some screenshots to show what I am experiencing.

I use the following command to set up the playground:
bash <(curl -s \
  https://raw.githubusercontent.com/cloudogu/gitops-playground/main/scripts/init-cluster.sh) \
  && sleep 2 && docker run --rm -it -v ~/.k3d/kubeconfig-gitops-playground.yaml:/home/.kube/config \
    --net=host \
    ghcr.io/cloudogu/gitops-playground --yes

The setup is successful and I get the following output in the terminal:

After following the link for Jenkins I see the login screen and can log in with the credentials provided. The result:


The dashboard looks as follows:

SCM-Manager contains a lot of repos, and at least one of them contains a Jenkinsfile. I haven’t checked all repos so far.

Best regards,

Khedron

Thank you for your detailed information.

Can you please provide the log of the Jenkins pod running inside the k3d cluster?
If your kubecontext is set correctly to the local k3d cluster that was created by the init-cluster script from the GitHub repo, it's just

kubectl logs jenkins-0

Otherwise you have to set the kubecontext via the flag --context=k3d-gitops-playground
(this is the default context name).
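
For reference, a minimal sketch of how the log could be saved to a file so it can be attached to a post. It assumes the default context name mentioned above and kubectl's global --context flag; the function name and the output file name are made up for illustration.

```shell
# Save the log of the Jenkins pod to a file.
# Assumptions: the pod is called jenkins-0 (as in the command above) and the
# context defaults to k3d-gitops-playground. save_jenkins_log and the default
# file name jenkins.log are hypothetical.
save_jenkins_log() {
  context="${1:-k3d-gitops-playground}"
  outfile="${2:-jenkins.log}"
  kubectl --context="$context" logs jenkins-0 > "$outfile"
}

# Usage (needs a running playground cluster):
#   save_jenkins_log k3d-gitops-playground jenkins.log
```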

Also, I just tried it with the latest image version, which worked for me. We’ve had CasC (Configuration as Code) issues with Jenkins in the past.
Could you tell me which image version of the gitops-playground you were using?

docker images --filter "label=org.opencontainers.image.title=gitops-playground"

This one lists the images present on your machine. If you were not using the image with the latest tag 58382b7 - could you please retry an installation with this one?

bash <(curl -s \
  https://raw.githubusercontent.com/cloudogu/gitops-playground/main/scripts/init-cluster.sh) \
  && sleep 2 && docker run --rm -it -v ~/.k3d/kubeconfig-gitops-playground.yaml:/home/.kube/config \
    --net=host \
    ghcr.io/cloudogu/gitops-playground:58382b7 --yes

Kind regards

Hi,

Thanks for the quick answer. I tried the image tagged “58382b7”. The result is the same, i.e. no changes at all. I attached the Jenkins logs to this post: jenkinslogs.pdf (120.9 KB)

Best regards,

Khedron

Hey khedr0n,

thank you for providing the Jenkins log. It seems that some plugins failed to install. In general it should still work, though, since the plugins needed for the folder jobs exist on your instance.

I’d be glad if you could help me debug what’s wrong. Could you please create those folder jobs manually (or at least one of them, e.g. argocd)? I’ll describe how below. If it does not work, can you please share the updated Jenkins log?

1. Create a new Item

2. Select organization folder and give it a name (e.g. argocd-applications)

3. Configure scm-manager

Hit save and it should trigger a folder scan and should pick up all 3 applications nginx-helm, petclinic-helm and petclinic-plain.

If this does work and you can create the folder manually, the cause must be a bug in the installation routine inside the container. I’m afraid we cannot debug the installation process any further, since we remove the container after installation with the --rm flag, and with it all the logs.
But you could re-install the gitops-playground, append the --debug flag and persist the output from stdout in a text file, like so:

bash <(curl -s \
  https://raw.githubusercontent.com/cloudogu/gitops-playground/main/scripts/init-cluster.sh) \
  && sleep 2 && docker run --rm -it -v ~/.k3d/kubeconfig-gitops-playground.yaml:/home/.kube/config \
    --net=host \
    ghcr.io/cloudogu/gitops-playground:58382b7 --yes --debug > playground.log

Thanks in advance and kind regards

Hi,

thanks for the response. I could configure the argocd job in Jenkins successfully:

But the build fails. I can see the sync status as “unknown” in the ArgoCD UI.

As you proposed, I started the playground with the container in debug mode. I attached the log file for your information.

plagroundlog.pdf (220.3 KB)

Best regards,

Khedron

Hi,

I see more or less the same behaviour.
To give you more information about my environment: I ran the playground's local machine demo on two different Ubuntu systems.
I was told that the playground should work on any modern Linux system in general, but that you are using Ubuntu. I followed that advice and installed Ubuntu Desktop (Kubuntu, to be more precise) on one system and Ubuntu Server on a second one.

The results differ, and I see issues with both. But first, let me show you my results for Ubuntu Desktop.

Environment:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS

The Ubuntu system has following packages installed: docker-ce, docker-ce-cli, containerd.io, curl, ssh, ntp, tigervnc-server, lsb-release, gnupg, ca-certificates, apt-transport-https

I used the following docker command to start the playground:
docker run -it -v ~/.k3d/kubeconfig-${CLUSTER_NAME}.yaml:/home/.kube/config --net=host ghcr.io/cloudogu/gitops-playground

At first glance the setup looks successful, as all the expected output is the same as already shown here, e.g. the welcome text with the URLs. But I can't find the log file /tmp/playground-log-…

When I open the URL http://localhost:9090/job/argocd-applications/, the folder appears to be empty.
Please see the following screenshot:

Moving back to the dashboard, the following folders are visible:

But if I move to any of these three folders, I just see the information that the folder is empty.

Next, I will follow the advice to use the specific version 58382b7 together with --debug to create the log file.
Afterwards I will also try to create the pipeline manually.

One small note at this point: on the Ubuntu Server system I don't see the empty folders in Jenkins. What I cannot say at the moment is whether the image used there is maybe a different one, as I installed it there with the same command:
docker run -it -v ~/.k3d/kubeconfig-${CLUSTER_NAME}.yaml:/home/.kube/config --net=host ghcr.io/cloudogu/gitops-playground

I will now use that specific version on Ubuntu Server in parallel to compare the results.
On the Server version I do see some other strange behaviour inside ArgoCD, but that is another topic which I'll post a bit later.

Cheers,
Sascha

Hi again,

after using version 58382b7, the folders argocd-applications and fluxv2-applications are still empty, but the folder fluxv1-applications now shows some content. That is on the Desktop system.

But if I go inside, I only see the shining sun for nginx-helm. Please see the following screenshots.

Please find attached the log file that was written using the --debug switch. It was created on the system based on Ubuntu Desktop.

playground_ubu_desktop.log.pdf (237.8 KB)

For me it seems that there is no difference between the image latest and the one tagged with 58382b7. Both are showing the same Image ID:
# docker images
REPOSITORY                               TAG                         IMAGE ID       CREATED         SIZE
paketobuildpacks/run                     base-cnb                    6ba0f9e0a837   6 days ago      90.5MB
ghcr.io/cloudogu/gitops-playground       58382b7                     b4898b250f9f   2 weeks ago     223MB
ghcr.io/cloudogu/gitops-playground       latest                      b4898b250f9f   2 weeks ago     223MB
rancher/k3s                              v1.21.2-k3s1                b41b52c9bb59   3 months ago    172MB
ghcr.io/cloudogu/helm                    3.5.4-1                     7d8efb6abf52   5 months ago    79.3MB
cytopia/yamllint                         1.25-0.7                    9fc4804b0e4d   11 months ago   41.1MB
lachlanevenson/k8s-kubectl               v1.19.3                     173611a3c52b   11 months ago   49.1MB
gcr.io/paketo-buildpacks/builder         base-platform-api-0.3       ea8eca12c563   41 years ago    787MB
localhost:30000/spring-petclinic-helm    202110051635-b86138f-main   9472c0bd4699   41 years ago    262MB
localhost:30000/spring-petclinic-plain   202110051635-5aba350-main   9472c0bd4699   41 years ago    262MB
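
The same comparison can be scripted instead of read off the table. A rough sketch, assuming the docker CLI is on the PATH; the function name same_image_id is made up for this example.

```shell
# Succeed only if two tags of one repository resolve to the same local image ID.
# same_image_id is a hypothetical helper; it only looks at images that are
# already present on the machine and does not pull anything.
same_image_id() {
  repo="$1"; tag_a="$2"; tag_b="$3"
  id_a="$(docker images --format '{{.ID}}' "$repo:$tag_a")"
  id_b="$(docker images --format '{{.ID}}' "$repo:$tag_b")"
  # An empty ID means the tag is not present locally, which counts as "not the same".
  [ -n "$id_a" ] && [ "$id_a" = "$id_b" ]
}

# Usage:
#   same_image_id ghcr.io/cloudogu/gitops-playground latest 58382b7 && echo "same image"
```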

Now comes the moment where I had a look at the results on the second system I prepared. It is based on Ubuntu Server. The same packages are installed as on the Desktop one.

But now I see a result that I absolutely did not expect. I ran the cluster setup three or four times during the last three days. Every time, the setup finished with the Jenkins pipelines available (please see the screenshot taken before I used the docker command with the specific image version and --debug).
To double-check the information I saw in this post, I ran it again on both systems to compare Server and Desktop, as I had always seen it working on my Server installation.
As written earlier, I also had some issues within ArgoCD. But before I can provide information about those, I have to get back to a proper Jenkins situation :(.

So, please find the log output of the Ubuntu Server system:
playground_ubu_srv.log.pdf (238.2 KB)

Also some screenshots, the first one showing the situation inside, before I did a re-setup of the cluster:

Now, after the cluster re-setup:

fascinating…

The overall picture does not really make sense.

  • On the Desktop system, all three application folders were empty. Now the fluxv1 one shows some content.
  • On the Server system, all three application folders were filled during the last days. Now all of them are empty.
    Strange.

In case you need any further log file or anything tested, please let me know.

Best regards,
Sascha

Hey again,

meanwhile I get the feeling that the setup process is not finished when the welcome screen appears.
As I did not check the result directly after the installation had finished during the first one or two days, I maybe did not realize that there are still ongoing steps and processes with some delays.

Playing around with the environment a bit more today, I captured kubectl output more than two hours ago and compared it with a fresh one. I'll add both as a PDF, as there is no way to add them as text.
Also, I'm not allowed to add more than three posts in a row without an answer in between… strange…

kubectl_outputs.pdf (124.3 KB)

As you can see, the second output shows some of the services that were missing before. The first service IPs are now appearing, e.g. for argocd-staging-nginx or pet-clinic.
Meanwhile, these IP addresses are usable in the browser and the corresponding pipelines appear automatically in Jenkins.

I did nothing, just let the system run.

But what does this mean? I assigned 6 virtual CPUs and 6 GB of memory to both of my test systems.
How many resources are needed for this playground? Unfortunately, I did not really find an answer to this.

Another question, several of the status and service information are in “pending” state.

Checking top, I see that the swap space is nearly used up, but there is no load, at least not at the moment or during the last minutes:
[screenshot of top output]
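
The swap usage can also be expressed as a number instead of being read from top. A small sketch; swap_used_percent is a hypothetical helper that reads the output of free -m on stdin and is not part of the playground.

```shell
# Print the used swap as a whole-number percentage, based on the "Swap:" line
# of `free -m` (columns: total, used, free). Prints "no swap" if none is configured.
swap_used_percent() {
  awk '$1 == "Swap:" {
    if ($2 > 0) printf "%d\n", $3 * 100 / $2
    else print "no swap"
  }'
}

# Usage:
#   free -m | swap_used_percent
```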

In the end I now see the same behaviour in ArgoCD as in the last days, which brings me to the next question: after starting the playground, all fields are green and in sync inside ArgoCD.
But now that I understand that not everything may be up at that time, that overview does not seem to show the true state at that moment, or am I wrong?

Now, in the state of that kubectl output with all these pending entries, I see this information in the ArgoCD view:

Coming back to the kubectl output: besides the services in the staging stage, I would also expect services for the production stage, right? Otherwise it would not make sense to me why some of the IPs are available as described in your manual and some are not so far (all production ones are missing at the moment).

How do I move forward? Do I have to wait one or two more days until everything is active, up and running? How can I check that the “normal” playground state is reached and I can start playing with it?
Do I have to add more CPUs or memory? What about the swap area, is that maybe also too small? I saw a warning about low swap space inside Jenkins.

What about moving to another k3s environment instead of using one k3d system that has to handle everything inside Docker?
May I ask for some information regarding the specifications for a k3s environment?

Just let me know if you need any log file or further information.

Many thanks and best regards,
Sascha

Hey Sascha,

thank you for your extensive feedback and effort on the playground.

Since there is a lot of information in your post, I will first address your question about the state of the applications and the playground itself:

In general the setup process is finished once the welcome screen is printed; at that point there should be working and configured applications. What should happen afterwards is that Jenkins discovers all the applications within its configured folder jobs and builds them, which then leads to deployments on the configured cluster.

In general the Jenkins build folders should auto-discover the applications in their namespace (unfortunately we cannot rely on that). This means that the empty job folders you see (e.g. argocd) require a manual click on the Scan Namespace Now button; they will then pick up all the underlying builds.
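
The manual click can probably be scripted as well. The sketch below assumes that Jenkins accepts a POST to the folder's build endpoint to start a scan, which is how scans of multibranch and organization folders are commonly triggered, and that an API token is used for authentication. JENKINS_URL, JENKINS_USER and JENKINS_API_TOKEN are placeholders, and trigger_folder_scan is a made-up name.

```shell
# Ask Jenkins to rescan one job folder (for example argocd-applications).
# Assumptions: JENKINS_URL, JENKINS_USER and JENKINS_API_TOKEN are set by the
# caller; the /build?delay=0 endpoint starts the folder scan on this instance.
trigger_folder_scan() {
  folder="$1"
  curl --silent --show-error --fail --request POST \
    --user "$JENKINS_USER:$JENKINS_API_TOKEN" \
    "$JENKINS_URL/job/$folder/build?delay=0"
}

# Usage:
#   JENKINS_URL=http://localhost:9090 trigger_folder_scan argocd-applications
```

If this does not start a scan on a given setup, the button in the UI remains the reliable way.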

Secondly, your question about the production apps: to get your code into the production stage you have to manually approve the pull requests on the corresponding gitops repo. This is the default behaviour of our gitops-build-lib. Of course you can create builds that deploy into production automatically, as is done for staging, but the default behaviour is to manually approve a PR.

We will address your other topics piece by piece as we look into them further.

Kind regards

Hey @khedr0n

sorry for the late response and thank you for the log file you provided.

What should happen is that Jenkins gets configured during the setup process (e.g. installing the necessary plugins, setting up global variables, creating job folders, …). In your case there seem to be strange failures when setting the environment variables: the curl call just fails with “unauthorized”.

The bad news is that we could not reproduce it yet. The good news is that I’m going to try it out on a VM that is as close to your environment as possible. Therefore I’d be happy if you could provide more information about your system. For now I know:

  • Ubuntu 20.04
  • Docker via snap

Kind regards

Hi,

I am running an Ubuntu VM on my Mac using Parallels. Please find detailed system information below:

OS: Ubuntu 20.04.3 LTS x86_64
Host: Parallels Virtual Platform Non
Kernel: 5.11.0-37-generic
CPU: Intel i5-8259U (2) @ 2.304GHz
GPU: 01:00.0 Red Hat, Inc. Virtio GP
Memory: 923MiB / 1978MiB

k3d version v4.4.7
k3s version v1.21.2-k3s1 (default)
docker version 20.10.8, build 3967b7d28e (installed from Ubuntu Software Center)
BD1909A079F397CEDD1FC0581AAE1C2D

Hope that helps! Best regards,

Martin

Hey @sbiallas,

regarding your note about the services in a pending state:

Another question, several of the status and service information are in “pending” state.

This is due to the fact that we support local as well as remote clusters. For convenience, all those services are of type LoadBalancer. But k3d can only bind one service to port 80 of the external IP address. We could use NodePort instead, but this would add more complexity. Therefore we decided to live with that smell for now.
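
To see which services are affected, the service list can be filtered. A minimal sketch, assuming the usual column layout of kubectl get svc --all-namespaces (NAMESPACE, NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, ...); pending_loadbalancers is a hypothetical helper name.

```shell
# Read the table printed by `kubectl get svc --all-namespaces` on stdin and
# print namespace/name for every LoadBalancer service without an external IP yet.
pending_loadbalancers() {
  awk 'NR > 1 && $3 == "LoadBalancer" && $5 == "<pending>" { print $1 "/" $2 }'
}

# Usage:
#   kubectl --context=k3d-gitops-playground get svc --all-namespaces | pending_loadbalancers
```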


Hey @khedr0n,

I wasn’t able to reproduce the error yet. But you could help me out by creating another instance of the playground and appending the --trace flag instead of just --debug, like so:

bash <(curl -s \
  https://raw.githubusercontent.com/cloudogu/gitops-playground/main/scripts/init-cluster.sh) \
  && sleep 2 && docker run --rm -it -v ~/.k3d/kubeconfig-gitops-playground.yaml:/home/.kube/config \
    --net=host \
    ghcr.io/cloudogu/gitops-playground:58382b7 --yes --trace > playground.log

Hi,

I ran the proposed command several times during the weekend. I didn’t change anything in my environment between the single runs but observed different behaviours.

Please find two different log files attached. Hope that helps!
20211010_2015_playground.pdf (847.1 KB)

20211010_1711_playground.pdf (994.3 KB)

Best regards,

Khedron

Hey @khedr0n,

sorry for my late response and thank you for the log files you provided. I assume that the run with the log 20211010_1711_playground.pdf worked well, as it looks like everything just worked fine? Or did it also result in any unexpected or erroneous behaviour?

For the other run, we identified an issue with retrieving a crumb from Jenkins, which is needed to send POST requests to Jenkins. Unfortunately there was an error in our script: it neither logged the response nor failed with an error, but simply continued running, which led to more failures. Sadly I was still not able to reproduce this locally.

One of my colleagues created a PR that introduces a more robust way to retrieve the crumb and also implements a retry mechanism and logging. It also fails early with an error when it cannot retrieve a crumb.
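
To illustrate the idea (this is not the code from the PR): a minimal sketch of a retry wrapper around the crumb request that returns the last error instead of carrying on. The crumbIssuer endpoint is the standard Jenkins one; the variable names, the retry counts and the function names are assumptions.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 on the first success, otherwise the exit status of the last attempt.
retry() {
  attempts="$1"; delay="$2"; shift 2
  n=1
  while :; do
    "$@" && return 0
    last_rc=$?
    [ "$n" -ge "$attempts" ] && return "$last_rc"
    n=$((n + 1))
    sleep "$delay"
  done
}

# Request a crumb; --fail makes curl exit non-zero on HTTP errors such as 401.
# JENKINS_URL, JENKINS_USER and JENKINS_PASSWORD are assumed to be set by the caller.
fetch_crumb() {
  curl --silent --show-error --fail \
    --user "$JENKINS_USER:$JENKINS_PASSWORD" \
    "$JENKINS_URL/crumbIssuer/api/json"
}

# Usage: give up after 5 attempts, 2 seconds apart, and stop on failure.
#   crumb_json="$(retry 5 2 fetch_crumb)" || { echo "could not get a crumb" >&2; exit 1; }
```

Depending on the Jenkins version, the crumb may be bound to the session, in which case the follow-up POST would have to reuse the same cookie jar.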

You could give it a try with the new image:

bash <(curl -s \
  https://raw.githubusercontent.com/cloudogu/gitops-playground/main/scripts/init-cluster.sh) \
  && sleep 2 && docker run --rm -it -v ~/.k3d/kubeconfig-gitops-playground.yaml:/home/.kube/config \
    --net=host \
    ghcr.io/cloudogu/gitops-playground:6c49457 --yes --trace > playground.log

Disclaimer: This PR has not been reviewed yet and may therefore undergo some more changes. It passed our build and e2e tests, which resulted in an image being built.


@sbiallas I just wrote about a new PR that improves the crumb handling. It also introduces some more changes that, for example, tackle the pending pods / services problem:

  • Avoid confusing users with pending pods by deactivating service-lb and traefik.
  • Avoid “Pending” services and the “Processing” state in Argo by switching to NodePort services on local clusters.
  • Avoid an error cascade when getting the crumb from Jenkins fails (check error code, print HTTP code, retry, eventually exit on error).

Feel free to have a look at the PR 💻

Hey @khedr0n

we have just shipped a smaller patch for our gitops-playground; one part of the patch should tackle the issues you ran into. Some of us were able to reproduce the issue a few times, but it was neither deterministic nor reliably reproducible.

We assume the issue is caused by an insufficient health check performed against Jenkins before the configuration starts. Jenkins can be reached in general, but some of its resources are not ready yet. Therefore the first request fails.
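
As an illustration of what a stricter check could look like (not the actual patch): a sketch that polls a URL until it answers with HTTP 200 before any configuration request is sent. Which endpoint really reflects "ready for configuration" is an assumption here, and wait_for_http_200 is a made-up name.

```shell
# Poll a URL until it returns HTTP 200, or give up after a number of tries.
# Arguments: URL, maximum tries (default 30), pause in seconds (default 2).
wait_for_http_200() {
  url="$1"; max_tries="${2:-30}"; pause="${3:-2}"
  try=1
  code=none
  while [ "$try" -le "$max_tries" ]; do
    # curl prints only the status code; connection errors are treated as 000.
    code="$(curl --silent --output /dev/null --write-out '%{http_code}' "$url")" || code=000
    [ "$code" = "200" ] && return 0
    try=$((try + 1))
    sleep "$pause"
  done
  echo "gave up waiting for $url (last HTTP code: $code)" >&2
  return 1
}

# Usage (the /login path is only an example endpoint):
#   wait_for_http_200 http://localhost:9090/login 30 2 || exit 1
```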

We hope this fixes the issue for most of your runs, if not all, and we are looking forward to your feedback.

Kind regards

Hi,

unfortunately it took a while to find some time to read and follow everything you wrote.
But I understand the behaviour of the staging -> production functionality more and more.
It seems that I was part of the problem: I did not understand right away where to find the right place to move to production via pull request.

As I did not want to come back without having checked everything on my side really carefully, it took some time. After a few tries and reading everything twice, three times … maybe more, I heard the “click”.
In addition, I became aware that I expected some of my modifications behind the wrong IP address. My bad.

Many thanks for all your writing and effort. Additionally, the cluster situation looks much better now when listing all namespaces. Thanks for that.

Now that I have a much better understanding, the next stage will be to integrate the playground into my running ecosystem and an existing k3s cluster.
Do you see any issues I could run into right away, and do you maybe have a hint for them?

One last piece of information: it seems that the “petclinic-helm” build fails quite often during the initial setup.
But after everything has finished and the load has gone back to normal, the next build works fine.

Unzipping /home/jenkins/.m2/wrapper/dists/apache-maven-3.6.3-bin/1iopthnavndlasol9gbrbg6bf2/apache-maven-3.6.3-bin.zip to /home/jenkins/.m2/wrapper/dists/apache-maven-3.6.3-bin/1iopthnavndlasol9gbrbg6bf2
Set executable permissions for: /home/jenkins/.m2/wrapper/dists/apache-maven-3.6.3-bin/1iopthnavndlasol9gbrbg6bf2/apache-maven-3.6.3/bin/mvn
[ERROR] Error executing Maven.
java.lang.NullPointerException
at java.base/java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011)
at java.base/java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006)
at java.base/java.util.Properties.put(Properties.java:1337)
at java.base/java.util.Properties.setProperty(Properties.java:225)
at org.apache.maven.cli.MavenCli.populateProperties(MavenCli.java:1653)
at org.apache.maven.cli.MavenCli.properties(MavenCli.java:597)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:279)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:193)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:282)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:225)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:406)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:347)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.maven.wrapper.BootstrapMainStarter.start(BootstrapMainStarter.java:39)
at org.apache.maven.wrapper.WrapperExecutor.execute(WrapperExecutor.java:122)
at org.apache.maven.wrapper.MavenWrapperMain.main(MavenWrapperMain.java:61)
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
ERROR: script returned exit code 1
[Pipeline] // catchError
[Pipeline] junit
Recording test results
[Checks API] No suitable checks publisher found.
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Finished: FAILURE

Many thanks and best regards,
Sascha

Hi,

Apologies for the late response.

I tried several things like removing the downloaded Docker images, deleting config files, etc. But I still can’t see any jobs in Jenkins. So my next approach will be to delete the VM and set up a new one. If that is not successful, I will try to run the playground on a non-VM Linux machine.

I’ll keep you updated.

Best regards,

Khedron

Hey Khedron,

did you manage to solve the problem by setting up a new VM?

Let us know if you still need help. 🙂

Best regards,
Maik