Commit Graph

239 Commits

Author SHA1 Message Date
Craig Sterrett
4556ca4ae9 Added --force flag to swupd repair command
Running setup_system.sh with the OS version pinned (to keep setup_system.sh from upgrading the OS) causes an error because a package has been removed. The fix adds the --force flag to the
sudo swupd repair -m "${CLR_VER}" --picky command

Closes issue #297
2020-01-09 13:02:22 -08:00
Justin Scott
00c1d60470 Update kubeadm.yaml to 1.17 version (#296)
Closes #295

Signed-off-by: Justin Scott <justin.a.scott@intel.com>
2020-01-08 12:43:57 -08:00
Khanak Nangia
07c2231e62 Updating ingress-nginx to v0.26.1 (#285)
* Updating ingress-nginx to v0.26.1

* removing extra line
2019-12-09 10:44:25 -08:00
Morales Quispe, Marcela
4ef8d34671 Make and rename net server process variables configurable
Some CNIs take longer for their related deployments to become ready, which
is why `proc_wait_time` needs to be customizable. `proc_wait_time` can now
be set at execution time and also has a default value for the time to pod
network test harness.

Signed-off-by: Morales Quispe, Marcela <marcela.morales.quispe@intel.com>
2019-12-09 14:56:29 +00:00
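
A minimal sketch of how a configurable wait such as `proc_wait_time` might be consumed. The unit (seconds), the default of 20, the `proc_sleep_time` variable, and the `wait_for` helper are all assumptions for illustration, not taken from the commit.

```shell
#!/bin/sh
# Assumptions: proc_wait_time is in seconds, 20 is a placeholder default,
# and proc_sleep_time / wait_for are hypothetical names.
proc_wait_time="${proc_wait_time:-20}"
proc_sleep_time="${proc_sleep_time:-1}"

# Re-run the given command until it succeeds or proc_wait_time is used up.
wait_for() {
    waited=0
    until "$@"; do
        [ "$waited" -ge "$proc_wait_time" ] && return 1
        sleep "$proc_sleep_time"
        waited=$((waited + proc_sleep_time))
    done
    return 0
}

# Example: a slow CNI can be given more time at execution time, e.g.
#   proc_wait_time=120 ./some_test.sh     (script name is a placeholder)
wait_for true && echo "ready"
```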
Khanak Nangia
bc0f257176 Updating metrics to v0.3.6 (#286) 2019-12-05 14:47:46 -08:00
Khanak Nangia
e70e32d36e Updating MetalLB to v0.8.3 (#284) 2019-12-05 14:40:11 -08:00
Khanak Nangia
518fa87f27 Updating cilium to v1.6.4 (#289) 2019-12-05 14:39:46 -08:00
Khanak Nangia
6cd87d74be Updating rook to v1.1.7 (#283) 2019-12-05 12:36:42 -08:00
Morales Quispe, Marcela
927ceddc9c Add time to pod network metric.
To measure the time to pod network, a deployment that uses the agnhost
image is exposed as a net server that replies to curl calls. The test
measures this reply time and saves it for later reporting.
Then, only the exposed net service gets deleted.

Signed-off-by: Morales Quispe, Marcela <marcela.morales.quispe@intel.com>
2019-12-05 11:00:41 +00:00
Graham Whaley
6298cf2054 metrics: tidy: widen the graphs
Move the legends to the bottom (underneath) for the tidy scaling graphs
to make them wider on the page, and thus easier to read with more
resolution.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2019-12-04 10:02:36 +00:00
Graham Whaley
acf5a95177 metrics: tidy: move local assign inside loop
The bootdata assignments were outside the 'valid file' check loop,
which meant in the case there was a data directory which did not
contain a valid scaling file, we would fail the assignment (as the
`local_bootdata` would be empty).

Fix by moving the assignments into the loop, thus only assigning when
we know we have valid data.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2019-12-04 10:02:36 +00:00
Graham Whaley
68a62f50bc metrics: report: improve interface y axis divs
Most of the time we have 0 interface errors or drops, so we pin the y
scale to '1', so we don't hit 'infinity' errors. That left us with a
strange y-axis label anomaly - as the axis was automatically divided
into 5 labels, and we got for some reason the sequence '0,0,0,1,1'.
That just plain looked wrong and confusing.
Fix it by using `pretty_breaks()` for the error/drop y axis, whilst
maintaining the `comma` count for the pod count y axis.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2019-12-04 10:02:36 +00:00
Graham Whaley
9431dd9f38 metrics: report: shrink page margins for more resolution
The pdf output by default has large page margins, which wastes a lot of
page space, and reduces our 'resolution'. Shrink the margins to a pretty
minimal 1cm to increase the graph resolution. The document itself then
does not look as 'pretty', but we can see more data visually.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2019-12-04 10:02:36 +00:00
Graham Whaley
9b8c7c093f metrics: collectd: move legends under graphs
Move the legends under the graphs to give more width, and thus
resolution, to the final pictures.
This works well for the collectd graphs as they are spread out
into sets of single column graphs per page.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2019-12-04 10:02:36 +00:00
Syed Ahsan
bdcb4fb5b7 Update Kata to v1.9.1 (#282)
This patch updates Kata to use v1.9.1 and adds the kustomization.

Signed-off: Syed Ahsan <syed.ahsan.shamim.zaidi@intel.com>
2019-12-02 16:34:10 -08:00
Eric Ernst
a0ca2a2017 set the snapshotter to devmapper in setup script
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
2019-11-25 13:29:13 -08:00
Obed N Munoz
46b3f230ee scaling: Remove tty parameter in report's generation cmd
This avoids tty-related issues in our CI systems, which do not
support a tty by default. With this change we avoid the following
failure in the report generation `docker run` command:
```
the input device is not a TTY
```

Signed-off-by: Obed N Munoz <obed.n.munoz@intel.com>
2019-11-22 09:17:03 +00:00
Graham Whaley
b6c7cf1b8e metrics: make pods_per_gb valid JSON
Under some circumstances, the pods_per_gb value would come out as <1,
and be generated without any leading 0's (such as `.14` rather than
`0.14`). This is not valid JSON, and would break the report generation
parsing in R.

Use `printf` to force a leading 0 prefix onto the value.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2019-11-21 08:48:40 -07:00
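
The `printf` fix described above can be sketched as follows. The helper name `json_number` and the four-digit precision are assumptions for illustration; the commit only states that `printf` is used to force the leading 0.

```shell
#!/bin/sh
# Hypothetical helper: turn a decimal such as ".14" into a JSON-safe number.
# The 4-digit precision is an assumption, not taken from the commit.
json_number() {
    # LC_ALL=C keeps "." as the decimal separator regardless of locale.
    LC_ALL=C printf '%.4f' "$1"
}

pods_per_gb=".14"   # example value missing its leading zero
echo "{ \"pods_per_gb\": $(json_number "$pods_per_gb") }"
```

JSON requires at least one digit before the decimal point, so `.14` is rejected by strict parsers while `0.1400` is accepted.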
Syed Ahsan
f146c771cc Using consistent ENV variable for RUNNER (#273)
Earlier there was confusion between CLRK8S_RUNNER and RUNNER in
the setup_system.sh and create_stack.sh scripts. This patch fixes this,
and the variable can now be set as either RUNNER or CLRK8S_RUNNER.

Signed-off: syed.ahsan.shamim.zaidi@intel.com
2019-11-20 15:35:18 -08:00
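
One way to accept either variable name is nested parameter expansion. This is a sketch only: the `resolve_runner` helper, the precedence (CLRK8S_RUNNER over RUNNER), and the `crio` fallback are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: accept either variable name. Letting CLRK8S_RUNNER win over
# RUNNER, and falling back to "crio", are assumptions.
resolve_runner() {
    echo "${CLRK8S_RUNNER:-${RUNNER:-crio}}"
}

RUNNER="$(resolve_runner)"
export RUNNER
echo "Using runtime: ${RUNNER}"
```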
Graham Whaley
9510b068e0 metrics: cpu-load: save cpu-load config in JSON
If the cpu-load function is enabled, save its config settings into the
JSON results file.
This required a little bit of re-sequencing of the json library calls,
to ensure we did the init of the JSON early enough, but not more than
once.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2019-11-15 12:23:38 -06:00
Graham Whaley
952e037420 metrics: enable cpu-load ability across tests
Rejig the framework a little to unify the init/shutdown calls and code,
which allows us to add the cpu-load enable/disable ability to all the
existing metrics.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2019-11-15 12:23:38 -06:00
Graham Whaley
c846e9753d metrics: add cpu-load generator code and docs
Add library code that can generate a variety of cpu loads across the
cluster. Configuration is via environment variables, documented in the
.md file.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2019-11-15 12:23:38 -06:00
Graham Whaley
39c7cc643a metrics: README: add some information about stats gathering
Add the `collectd` subdir to the top level description of the code
layout, and add some developer details about how and where the stats
code lives and is configured/enabled.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2019-11-15 12:23:38 -06:00
CraigSterrett
54adf53cdd Adding non-routable IPs to no_proxy (#270)
Add non-routable IPs to no_proxy. Non-routable addresses should
never go out through the proxy, and they are used as internal IPs by VMs
running cloud-native setup. Without these IPs in no_proxy,
Kubernetes nodes cannot communicate with the Kubernetes IP because
the traffic is routed out through the proxy server.

Signed-off-by: Craig Sterrett <craig.Sterrett@intel.com>
2019-11-14 11:18:59 -08:00
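
A hedged sketch of extending `no_proxy` with private address ranges. The `add_no_proxy` helper is hypothetical, and the RFC 1918 CIDR blocks are an assumption about what "non-routable IPs" covers here; whether CIDR notation is honoured in `no_proxy` depends on the client tool.

```shell
#!/bin/sh
# Hypothetical helper: append an entry to a comma-separated list
# only if it is not already present.
add_no_proxy() {
    list="$1"; entry="$2"
    case ",${list}," in
        *",${entry},"*) echo "${list}" ;;              # already listed
        *) echo "${list:+${list},}${entry}" ;;          # append, no leading comma
    esac
}

no_proxy="${no_proxy:-localhost,127.0.0.1}"
# RFC 1918 private ranges (assumed; CIDR support in no_proxy is tool-dependent).
for range in 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16; do
    no_proxy="$(add_no_proxy "${no_proxy}" "${range}")"
done
export no_proxy
echo "no_proxy=${no_proxy}"
```

Checking for an existing entry before appending keeps the list stable when the setup script is run more than once.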
CraigSterrett
00df885b45 Fixed code for no_proxy in setup_system.sh (#265)
The sed command for updating no-proxy settings in the proxy.sh file was
not working. I made it match the above command which was working for
/etc/environment.

Signed-off-by: Craig Sterrett <craig.sterrett@intel.com>
2019-11-13 09:48:59 -08:00
David Lyle
a76cc3437e Specify hostname for file name for collectd
On some machines the value for hostname, that is used in naming
the csv directory, isn't always determined by collectd to be
localhost. The scaling code assumes it will always be localhost.

This patch specifies the hostname to be localhost.

Signed-off-by: David Lyle <dklyle0@gmail.com>
2019-11-13 09:25:33 +00:00
David Lyle
e09285f1e1 support older network interface naming
KIND uses the older network interface naming standard. Other
operating system images may as well. Adding support for 'eth'
network interface naming prefix.

Signed-off-by: David Lyle <dklyle0@gmail.com>
2019-11-12 16:51:08 -06:00
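
Prefix matching of this kind can be sketched with a shell `case` pattern. The `is_tracked_iface` helper is hypothetical, and the `en` prefix is an assumption about which naming scheme was already supported before `eth` was added.

```shell
#!/bin/sh
# Sketch: accept both legacy ("eth0") and predictable ("enp0s3", "eno1") names.
# The "en" prefix is an assumption about what was already supported.
is_tracked_iface() {
    case "$1" in
        eth*|en*) return 0 ;;
        *)        return 1 ;;
    esac
}

for ifname in eth0 enp0s3 lo docker0; do
    if is_tracked_iface "$ifname"; then
        echo "collect stats for ${ifname}"
    else
        echo "skip ${ifname}"
    fi
done
```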
Eric Ernst
b4e6813ed6 devmapper: update to be functional, usable
base_image_size is effectively the max size of the thinpool snapshot,
and this is static (does not resize).  If you are running containers
with larger individual layers, this will fail. (elastic is a good test
for this).

The thinpool should be 10GB, not 1GB (to align with the .img's defined
earlier in file).

Fixes: #266

Signed-off-by: Eric Ernst <eric.ernst@intel.com>
2019-11-11 09:49:35 -08:00
David Lyle
7efe99f139 disregard collectd tail data from stats
Since collectd is started before the pods are launched and
shutdown after the last pod is launched, we gather data outside
the pod launch window which can adversely influence the per pod
launch stats. This is especially true after the last pod launches
as all the pods are then deleted before collectd stops collecting
metrics.

This patch isolates the collectd data used to only
coincide with the pod launch window.

An additional change in this patch improves the secondary
y axis scaling. Previously there was an ill-advised check that
forced the scale to be at least 1. This does not work well when
the pod number is significantly higher than, say, 100 (the max
possible cpu idle value).

This patch changes the scaling to be across all data to be graphed.
The one special condition is interface drops and interface errors,
where the data is typically 0, because we don't scale by 0.

Signed-off-by: David Lyle <dklyle0@gmail.com>
2019-11-06 14:31:56 +00:00
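
The idea of restricting collectd data to the pod launch window can be sketched as a timestamp filter. This is an illustration in shell rather than the report's own code; the `filter_window` helper is hypothetical, and it assumes the first CSV column is the epoch timestamp and the first line is a header.

```shell
#!/bin/sh
# Sketch: keep only CSV rows whose epoch falls inside the pod launch window.
# Assumes column 1 is the epoch timestamp and line 1 is a header.
filter_window() {
    start="$1"; end="$2"; file="$3"
    awk -F, -v s="$start" -v e="$end" '
        NR == 1 { print; next }                 # keep the header line
        $1 + 0 >= s + 0 && $1 + 0 <= e + 0      # keep rows inside [start, end]
    ' "$file"
}

# Example with made-up samples: rows before the first pod launch (100.0)
# and after the last pod launch (400.0) are dropped.
sample="$(mktemp)"
printf 'epoch,value\n100.0,99\n200.0,50\n300.0,10\n400.0,98\n' > "$sample"
filter_window 200 300 "$sample"
rm -f "$sample"
```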
Khanak Nangia
103bfcc681 updating the deprecated APIs for K8s v1.16 (#259) 2019-11-05 14:15:59 -08:00
Khanak Nangia
9574f44b20 Cleaning containerd data in reset_stack script (#260) 2019-11-05 14:15:46 -08:00
Morales Quispe, Marcela
f7254e2b30 Edit test description for k8s_scale and k8s_parallel tests.
Signed-off-by: Morales Quispe, Marcela <marcela.morales.quispe@intel.com>
2019-11-05 09:15:44 +00:00
David Lyle
df0af2ab2c fixing makereport container build
A recent change to debian apt repositories led to build errors
for the report container. The error was around stretch release
files. The container image we are based on published an update
which fixes this error. This patch updates to use :latest to
avoid errors when building.

Signed-off-by: David Lyle <dklyle0@gmail.com>
2019-10-30 09:09:39 +00:00
Justin Scott
e985ffd6e0 Kustomize kata-deploy image to stable version (#254)
Signed-off-by: Justin Scott <justin.a.scott@intel.com>
v1.8
2019-10-28 15:25:28 -07:00
Justin Scott
7840d720b0 Add CLRK8S_CLR_VER to Vagrantfile (#253)
This adds env var to the Vagrantfile so that Clear version can be
specified.

Signed-off-by: Justin Scott <justin.a.scott@intel.com>
2019-10-28 15:03:38 -07:00
Syed Ahsan
43fa8bad8f Adding Kustomization for Cilium (#246)
This patch adds kustomization path and file for Cilium

Signed-off: Syed Ahsan <syed.ahsan.shamim.zaidi@intel.com>
2019-10-28 14:08:51 -07:00
Justin Scott
da6087762a Add CLRK8S_CLR_VER to setup_system.sh (#250)
This adds ability to upgrade/downgrade to desired Clear version.
The default behavior remains unchanged (auto updating to latest)

Signed-off-by: Justin Scott <justin.a.scott@intel.com>
2019-10-28 10:16:57 -07:00
Justin Scott
5b03651467 Update Canal to v3.10 (#247)
Signed-off-by: Justin Scott <justin.a.scott@intel.com>
2019-10-25 14:45:32 -07:00
Khanak Nangia
598289573b updating readme with increased CPUS and MEMORY value for vagrant VMs (#242) 2019-10-25 12:07:36 -07:00
Justin Scott
6972963363 Disable git checkout advice in create_stack.sh (#244)
This PR disables the git advice message seen for
each component as it is installed.

Signed-off-by: Justin Scott <justin.a.scott@gmail.com>
2019-10-25 08:22:43 -07:00
Syed Ahsan
d271a73fe3 Adding Global Functions outside (#243)
This patch adds Global functions versions outside
the create_stack.sh file.

Signed-off: Syed Ahsan <syed.ahsan.shamim.zaidi@intel.com>
2019-10-25 08:19:06 -07:00
Syed Ahsan
674ab84d0d Adding Flannel to CNI (#239)
This patch adds Flannel to CNI. Export the environment variable
to use Flannel; crio is used by default unless the user
specifies otherwise.

Signed-Off: Syed Ahsan <syed.ahsan.shamim.zaidi@intel.com>
2019-10-24 18:05:44 -07:00
Justin Scott
3984a18b18 Add node-feature-discovery (#236)
This adds the node-feature-discovery component.

Partially implements #32

Signed-off-by: Justin Scott <justin.a.scott@intel.com>
2019-10-24 16:35:47 -07:00
Justin Scott
02880b82ee Add update_checker.sh (#238)
This adds a script that parses create_stack for component versions
and URLs and compares them to the latest versions available.

Signed-off-by: Justin Scott <justin.a.scott@intel.com>
2019-10-24 15:56:28 -07:00
Khanak Nangia
88b899d7e3 increasing the CPUS and MEMORY default value for vagrant VMs (#240) 2019-10-24 14:58:06 -07:00
Julio Rivera
2ae8360760 Adding Cert sans to kubeadm config (#188)
* Adding Cert sans to kubeadm config

This patch adds the ability to pass a list of IPs or names to the certSANs
property in the kubeadm.yml file.

Signed-off-by: Rivera Gonzalez, Julio C <julio.c.rivera.gonzalez@intel.com>

* Make idempotent addition of certSANs

Signed-off-by: Rivera Gonzalez, Julio C <julio.c.rivera.gonzalez@intel.com>
2019-10-24 14:45:30 -07:00
Syed Ahsan
aa86c554e8 Adding Cilium to create_stack (#218)
This change adds Cilium to the create_stack script, and uses crio
by default unless the user specifies otherwise.

Signed-Off: Syed Ahsan <syed.ahsan.shamim.zaidi@intel.com>
2019-10-24 12:00:32 -07:00
CraigSterrett
ad2fc108d3 Added .svc to no_proxy (#235)
Added .svc to the no_proxy in setup_system.sh to keep Kubernetes service
requests from being routed to the proxy. This fixes an issue that
Humberto was seeing in his testing.

Signed-off-by: Craig Sterrett <craig.sterrett@intel.com>
2019-10-24 00:44:05 -07:00
Julio Rivera
4edbaebf87 Removes kustomize from kubeadm init (#222)
This commit drops kustomize support from kubeadm config file.

Signed-off-by: Rivera Gonzalez, Julio C <julio.c.rivera.gonzalez@intel.com>
2019-10-23 23:43:12 -07:00
David Lyle
e7b7d33be0 handle multiday collectd data
The collectd csv plugin starts a new file for each day that data is
being recorded. Currently, collectd_scaling.R only reads from the
first day's file. This leads to incomplete data being rendered in
the report charts. All the data files are collected and present,
they just need to be read.

This patch makes changes to read all the days of collectd data and add
them to the data set.

Signed-off-by: David Lyle <dklyle0@gmail.com>
2019-10-23 12:02:59 -05:00
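
Reading every per-day file instead of only the first can be sketched as below. This is a shell illustration, not the code in collectd_scaling.R; the `merge_days` helper is hypothetical, and it assumes files are named `<metric>-YYYY-MM-DD` and that each file starts with a header line.

```shell
#!/bin/sh
# Sketch: merge every per-day file for one metric into a single CSV stream.
# Assumes files are named "<metric>-YYYY-MM-DD" and each starts with a header.
merge_days() {
    dir="$1"; metric="$2"
    first=1
    # The glob expands in lexical order, which is chronological for ISO dates.
    for f in "$dir"/"$metric"-[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
        [ -f "$f" ] || continue
        if [ "$first" -eq 1 ]; then
            cat "$f"            # first day: keep the header
            first=0
        else
            tail -n +2 "$f"     # later days: skip the repeated header
        fi
    done
}

# Example with made-up data spanning two days.
demo="$(mktemp -d)"
printf 'epoch,value\n1,10\n' > "$demo/percent-idle-2019-10-22"
printf 'epoch,value\n2,20\n' > "$demo/percent-idle-2019-10-23"
merge_days "$demo" percent-idle
rm -rf "$demo"
```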