Running setup_system.sh on a system where the OS version has been set
(to keep setup_system from upgrading the OS) causes an error because a
package has been removed. Add the --force flag to the
sudo swupd repair -m "${CLR_VER}" --picky command.
Closes issue #297
Some CNIs take longer for their related deployments to become ready,
which is why `proc_wait_time` needs to be customizable. `proc_wait_time`
can now be set at execution time, and also has a default value, for the
time to pod network test harness.
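
A minimal sketch of how an execution-time override with a fallback could
look in shell. The helper name `resolve_wait_time` and the default of 120
seconds are assumptions for illustration, not the values used by the
harness.

```shell
#!/bin/bash
# Print the wait time to use: the caller's value when proc_wait_time is
# set and non-empty, otherwise a default.
# NOTE: 120 is a hypothetical default, not the harness's real value.
resolve_wait_time() {
    echo "${proc_wait_time:-120}"
}

echo "waiting up to $(resolve_wait_time)s for deployments to become ready"
```

With this pattern a caller can prefix the invocation with
`proc_wait_time=300` to override the default for a single run.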
Signed-off-by: Morales Quispe, Marcela <marcela.morales.quispe@intel.com>
To measure the time to pod network, a deployment that uses the agnhost
image is exposed as a net server that replies to curl calls. The test
measures this reply time and saves it for further reporting.
Afterwards, only the exposed net service gets deleted.
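
The measurement idea can be sketched as a polling loop that times how
long a command takes to first succeed. The function name
`time_until_reply`, the 0.1s poll interval and the service address in the
usage comment are assumptions, not taken from the test itself.

```shell
#!/bin/bash
# Poll a command until it succeeds, then print the elapsed time in
# milliseconds. Requires GNU date for the %N (nanoseconds) format.
# NOTE: this sketch loops forever if the command never succeeds; a real
# harness would also bound the wait.
time_until_reply() {
    local start_ns end_ns
    start_ns=$(date +%s%N)
    until "$@" >/dev/null 2>&1; do
        sleep 0.1
    done
    end_ns=$(date +%s%N)
    echo $(( (end_ns - start_ns) / 1000000 ))
}

# Hypothetical usage; SERVICE_IP and the port are placeholders:
#   time_until_reply curl -s --max-time 1 "http://${SERVICE_IP}:8080"
```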
Signed-off-by: Morales Quispe, Marcela <marcela.morales.quispe@intel.com>
Move the legends to the bottom (underneath) for the tidy scaling graphs
to make them wider on the page, and thus easier to read with more
resolution.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
The bootdata assignments were outside the 'valid file' check loop,
which meant in the case there was a data directory which did not
contain a valid scaling file, we would fail the assignment (as the
`local_bootdata` would be empty).
Fix by moving the assignments into the loop, thus only assigning when
we know we have valid data.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Most of the time we have 0 interface errors or drops, so we pin the y
scale to '1', so we don't hit 'infinity' errors. That left us with a
strange y-axis label anomaly - as the axis was automatically divided
into 5 labels, and we got for some reason the sequence '0,0,0,1,1'.
That just plain looked wrong and confusing.
Fix it by using `pretty_breaks()` for the error/drop y axis, whilst
maintaining the `comma` count for the pod count y axis.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
The pdf output by default has large page margins, which wastes a lot of
page space, and reduces our 'resolution'. Shrink the margins to a pretty
minimal 1cm to increase the graph resolution. The document itself then
does not look as 'pretty', but we can see more data visually.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Move the legends under the graphs to give more width, and thus
resolution, to the final pictures.
This works well for the collectd graphs as they are spread out
into sets of single column graphs per page.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
This is in order to avoid tty-related issues in our CI systems,
which by default do not support a tty. With this change we avoid
the following failure of the report generation `docker run` command.
```
the input device is not a TTY
```
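
One common way to keep a `docker run` invocation working both in a
terminal and in CI is to request a TTY only when stdin is one. This is a
sketch of that alternative, not necessarily the change made here; the
function name and the image name in the usage comment are hypothetical.

```shell
#!/bin/bash
# Pick `docker run` interactivity flags based on whether stdin is a
# terminal.
docker_tty_flags() {
    if [ -t 0 ]; then
        echo "-it"   # interactive shell: allocate a pseudo-TTY
    else
        echo "-i"    # CI or pipe: no TTY available, so do not request one
    fi
}

# Hypothetical usage; the image name is a placeholder:
#   docker run --rm $(docker_tty_flags) report-image
```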
Signed-off-by: Obed N Munoz <obed.n.munoz@intel.com>
Under some circumstances, the pod_per_gb value would come out as <1,
and be generated without any leading 0's (such as `.14` rather than
`0.14`). This is not valid JSON, and would break the report generation
parsing in R.
Use `printf` to force a leading 0 prefix onto the value.
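
A small sketch of the `printf` technique. The function name and the
choice of two decimal places are assumptions for illustration.

```shell
#!/bin/bash
# Normalise a decimal such as ".14" to "0.14" so it is a valid JSON number.
# LC_ALL=C keeps "." as the decimal separator regardless of user locale.
# NOTE: the two-decimal precision is an assumption for this sketch.
format_json_number() {
    LC_ALL=C printf "%.2f\n" "$1"
}

format_json_number ".14"
```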
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Earlier there was confusion between CLRK8S_RUNNER and RUNNER in the
setup_system.sh and create_stack.sh scripts. This patch fixes that,
and the variable can now be set as either RUNNER or CLRK8S_RUNNER.
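
Accepting either variable name can be sketched with nested parameter
expansion. The precedence shown (CLRK8S_RUNNER wins over RUNNER), the
`crio` fallback and the function name are assumptions, not confirmed
behaviour of the scripts.

```shell
#!/bin/bash
# Resolve the container runtime from either variable name.
# NOTE: precedence and the "crio" fallback are assumptions for this sketch.
resolve_runner() {
    echo "${CLRK8S_RUNNER:-${RUNNER:-crio}}"
}

echo "using runner: $(resolve_runner)"
```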
Signed-off: syed.ahsan.shamim.zaidi@intel.com
If the cpu-load function is enabled, save its config settings into the
JSON results file.
This required a little bit of re-sequencing of the json library calls,
to ensure we did the init of the JSON early enough, but not more than
once.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Rejig the framework a little to unify the init/shutdown calls and code,
which allows us to add the cpu-load enable/disable ability to all the
existing metrics.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Add library code that can generate a variety of cpu loads across the
cluster. Configuration is via environment variables, documented in the
.md file.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Add the `collectd` subdir to the top level description of the code
layout, and add some developer details about how and where the stats
code lives and is configured/enabled.
Signed-off-by: Graham Whaley <graham.whaley@intel.com>
Add non-routable IPs to no_proxy. Non-routable addresses will never
go out via the proxy, and are used as internal IPs by VMs running
cloud-native setup. Without these IPs in no_proxy, Kubernetes nodes
will not be able to communicate with the Kubernetes IP, as the traffic
will be routed out to the proxy server.
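
As an illustration, the sketch below appends the RFC 1918 private ranges
to a comma-separated no_proxy list. The exact addresses added by this
change are not listed here, so the ranges and the function name are
assumptions; note also that not every client honours CIDR notation in
no_proxy.

```shell
#!/bin/bash
# Append the RFC 1918 private ranges to a comma-separated no_proxy list.
# NOTE: the range list is an assumption for this sketch.
add_private_ranges() {
    local current="$1"
    local ranges="10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"
    if [ -z "$current" ]; then
        echo "$ranges"
    else
        echo "${current},${ranges}"
    fi
}

no_proxy=$(add_private_ranges "${no_proxy:-}")
export no_proxy
```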
Signed-off-by: Craig Sterrett <craig.Sterrett@intel.com>
The sed command for updating no-proxy settings in the proxy.sh file was
not working. I made it match the above command which was working for
/etc/environment.
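
A hedged sketch of an in-place `sed` update of a no_proxy line. It
assumes GNU sed, an unquoted `no_proxy=` (optionally `export no_proxy=`)
assignment, and an entry that does not contain `|`; the function name is
hypothetical and the real command in the script may differ.

```shell
#!/bin/bash
# Append an entry to the no_proxy line of a file, in place (GNU sed -i).
# NOTE: not idempotent; running it twice appends the entry twice.
append_no_proxy() {
    local file="$1" entry="$2"
    sed -i -E "/^(export )?no_proxy=/ s|\$|,${entry}|" "$file"
}

# Hypothetical usage; the path and entry are placeholders:
#   append_no_proxy ./proxy.sh "10.0.0.0/8"
```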
Signed-off-by: Craig Sterrett <craig.sterrett@intel.com>
On some machines the value for hostname, that is used in naming
the csv directory, isn't always determined by collectd to be
localhost. The scaling code assumes it will always be localhost.
This patch specifies the hostname to be localhost.
Signed-off-by: David Lyle <dklyle0@gmail.com>
KIND uses the older network interface naming standard. Other
operating system images may as well. Adding support for 'eth'
network interface naming prefix.
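
Matching both naming schemes can be sketched with a `case` pattern. The
function name is hypothetical, and the assumption that the previously
supported prefix was `en` (predictable interface names) is not confirmed
by this change.

```shell
#!/bin/bash
# Return success for interface names using either the predictable
# ("en...") or the older ("eth...") naming scheme.
# NOTE: the "en" prefix is an assumption for this sketch.
is_tracked_iface() {
    case "$1" in
        en*|eth*) return 0 ;;
        *) return 1 ;;
    esac
}

for iface in eth0 enp3s0 lo; do
    if is_tracked_iface "$iface"; then
        echo "$iface: tracked"
    else
        echo "$iface: skipped"
    fi
done
```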
Signed-off-by: David Lyle <dklyle0@gmail.com>
base_image_size is effectively the max size of the thinpool snapshot,
and this is static (does not resize). If you are running containers
with larger individual layers, this will fail. (elastic is a good test
for this).
The thinpool should be 10GB, not 1GB (to align with the .img's defined
earlier in file).
Fixes: #266
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Since collectd is started before the pods are launched and
shut down after the last pod is launched, we gather data outside
the pod launch window which can adversely influence the per pod
launch stats. This is especially true after the last pod launches
as all the pods are then deleted before collectd stops collecting
metrics.
This patch isolates the collectd data used to only
coincide with the pod launch window.
An additional change in this patch improves the secondary
y axis scaling. Previously there was an ill-advised check to
force the scale to be at least 1. This does not work well when
the pod number is significantly higher than, say, 100 (the max
possible cpu idle value).
This patch changes the scaling to be across all data to be graphed.
The special case is interface drops and interface errors,
where the data is typically 0: there we don't scale by 0.
Signed-off-by: David Lyle <dklyle0@gmail.com>
A recent change to debian apt repositories led to build errors
for the report container. The error was around stretch release
files. The container image we are based on published an update
which fixes this error. This patch updates to use :latest to
avoid errors when building.
Signed-off-by: David Lyle <dklyle0@gmail.com>
This adds the ability to upgrade/downgrade to a desired Clear version.
The default behavior remains unchanged (auto updating to latest).
Signed-off-by: Justin Scott <justin.a.scott@intel.com>
This patch adds Flannel as a CNI, exports the environment variable
to use Flannel, and uses crio as the default unless the user
specifies otherwise.
Signed-Off: Syed Ahsan <syed.ahsan.shamim.zaidi@intel.com>
This adds a script that parses create_stack for component versions
and URLs and compares them to the latest versions available.
Signed-off-by: Justin Scott <justin.a.scott@intel.com>
* Adding Cert sans to kubeadm config
This patch adds the possibility to pass a list of IPs or names to the certSANs
property in the kubeadm.yml file.
Signed-off-by: Rivera Gonzalez, Julio C <julio.c.rivera.gonzalez@intel.com>
* Make idempotent addition of certSANs
Signed-off-by: Rivera Gonzalez, Julio C <julio.c.rivera.gonzalez@intel.com>
This change adds Cilium to the create_stack script, and uses crio
by default unless the user specifies otherwise.
Signed-Off: Syed Ahsan <syed.ahsan.shamim.zaidi@intel.com>
Added .svc to no_proxy in setup_system.sh to keep Kubernetes service
requests from being routed to the proxy. This fixes an issue that
Humberto was seeing in his testing.
Signed-off-by: Craig Sterrett <craig.sterrett@intel.com>
The collectd csv plugin starts a new file for each day that data is
being recorded. Currently, collectd_scaling.R only reads from the
first day's file. This leads to incomplete data being rendered in
the report charts. All the data files are collected and present,
they just need to be read.
This patch makes changes to read all the days of collectd data and add
them to the data set.
Signed-off-by: David Lyle <dklyle0@gmail.com>