Vocabulary 2020-05-29

„to learn the ropes“

Meaning:

To learn the basics of how something is done

Example:

You'll be fine once you learn the ropes

„in a nutshell“

Meaning:

In summary, or in as few words as possible

Example:

In a nutshell, the meeting was about Cloudera

„the big picture“

Meaning:

The overall view, or the situation as a whole

Example:

If you look at the big picture, the meeting went quite well

„to go back to the drawing board“

Meaning:

To start over, or go back to the first stage of a project

Example:

The boss hates it. We have to go back to the drawing board

Homework for 2020-06-05

Assigned to Alexander & Christine

  • You can choose one of the two articles (or both 🙂)
  • You can work together or on your own
  • Next week you will present a short summary (5-10 minutes)
  • You can simply tell us what the article is about, or use screen sharing / a presentation
  • Afterwards you will open and moderate the questions from the audience

Article 1:
AI Keeps Mastering Games, But Can It Win in the Real World?
The challenges of moving bots off the chess board and into the mess of life

https://www.theatlantic.com/technology/archive/2018/02/ai-keeps-mastering-games-but-can-it-win-in-the-real-world/554312/

Article 2:
AI-Driven Dermatology Could Leave Dark-Skinned Patients Behind
Machine learning has the potential to save thousands of people from skin cancer each year—while putting others at greater risk.

https://www.theatlantic.com/health/archive/2018/08/machine-learning-dermatology-skin-color/567619/

If the links are not reachable, the articles are available as PDFs on our file server:

http://files.pbao.de/index.php/s/zQALi4PJG2coKEz

CentOS: differences between 6 and 7

RHEL Major differences (v2)
~~~~~~~~~~~~~~~~~~~~~~~
RHEL6:
Kernel: 2.6
Upstart with SysV-style init scripts (as init daemon)
iptables (firewall, backend)
iptables (firewall front end tool)

RHEL7:
Kernel: 3.10
systemd (as init daemon)
iptables (firewall, backend)
firewalld (firewall daemon to manage the rules; command line interface is firewall-cmd)

RHEL8:
Kernel: 4.18
systemd (as init daemon)
nftables (firewall, backend)
firewalld (firewall daemon to manage the rules; command line interface is firewall-cmd)
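
To tell quickly which generation a machine belongs to, the kernel series from the list above can be used as a rough indicator. The following is a minimal sketch; the function name rhel_major_from_kernel is made up for illustration, and it only knows the three kernel series listed above.

```shell
#!/bin/sh
# Map a kernel release string to the RHEL/CentOS major version.
# Only the kernel series from the notes above (2.6, 3.10, 4.18) are known;
# everything else is reported as "unknown".
rhel_major_from_kernel() {
    case "$1" in
        2.6.*)  echo 6 ;;
        3.10.*) echo 7 ;;
        4.18.*) echo 8 ;;
        *)      echo unknown ;;
    esac
}

# Classify the kernel of the machine this runs on
rhel_major_from_kernel "$(uname -r)"
```

On a machine with a different kernel series (for example a laptop or another distribution) the sketch simply prints "unknown".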


Package Manager
~~~~~~~~~~~~~~~~~~~~~~~
yum is the package manager in RHEL 6 and 7; in RHEL 8 the yum command still works but is provided by dnf


Check and disable firewall
~~~~~~~~~~~~~~~~~~~~~~~

# the VM is protected by an Amazon firewall (security group), so we don't need one on the local machine
# don't forget to restrict your security group (inbound connections) to your current internet IP address

yum -y install firewalld
# stop the running daemon as well; "disable" alone only prevents the start at boot
systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl status firewalld.service

iptables -v -n -L
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

ip6tables -v -n -L
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination



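# if rules exist or the default policy of a chain is not ACCEPT:
# set the policies to ACCEPT first (--flush removes rules but does not reset a DROP policy)
iptables -P INPUT ACCEPT; iptables -P FORWARD ACCEPT; iptables -P OUTPUT ACCEPT
iptables --flush
ip6tables -P INPUT ACCEPT; ip6tables -P FORWARD ACCEPT; ip6tables -P OUTPUT ACCEPT
ip6tables --flush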
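
Instead of reading the listing by eye, the check "every chain has policy ACCEPT and no rules" can be scripted. This is a minimal sketch, assuming the verbose listing format shown above (iptables -v -n -L); the function name firewall_is_open is made up for illustration.

```shell
#!/bin/sh
# Reads the output of "iptables -v -n -L" (or ip6tables) on stdin.
# Exit status 0: all chains have policy ACCEPT and contain no rules.
# Exit status 1: a chain has another policy, or at least one rule is listed.
firewall_is_open() {
    awk '
        /^Chain /           { if ($4 != "ACCEPT") bad = 1; next }  # chain header line
        /^[[:space:]]*pkts/ { next }                               # column header line
        NF > 0              { bad = 1 }                            # anything else is a rule
        END                 { exit bad + 0 }
    '
}

# Usage (as root):
#   iptables  -v -n -L | firewall_is_open && echo "IPv4: nothing to flush"
#   ip6tables -v -n -L | firewall_is_open && echo "IPv6: nothing to flush"
```

User-defined chains are also reported as "not open", since their header line carries no ACCEPT policy.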


Disable SELINUX
~~~~~~~~~~~~~~~~~~~~~~~
# run sestatus to check whether SELinux is enabled
sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      31

# disable SELinux to avoid an extra layer of configuration and setup problems
# (the change in the config file takes effect after the next reboot)
grep 'SELINUX=' /etc/selinux/config
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
grep 'SELINUX=' /etc/selinux/config
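
Before editing the real file, the effect of the sed substitution can be previewed on a scratch copy. The following is a minimal sketch; the function names selinux_config_mode and preview_selinux_disable are made up for illustration, and the file argument defaults to /etc/selinux/config.

```shell
#!/bin/sh
# Print the value of the active (uncommented) SELINUX= line.
# SELINUXTYPE= and comment lines do not match the pattern.
selinux_config_mode() {
    sed -n 's/^SELINUX=//p' "${1:-/etc/selinux/config}"
}

# Apply the substitution to a temporary copy and print the resulting mode.
# The original file is left untouched.
preview_selinux_disable() {
    src=${1:-/etc/selinux/config}
    copy=$(mktemp) || return 1
    sed 's/^SELINUX=enforcing/SELINUX=disabled/' "$src" > "$copy"
    selinux_config_mode "$copy"
    rm -f "$copy"
}

# Usage:
#   selinux_config_mode        # current setting in the config file
#   preview_selinux_disable    # setting the file would have after the sed call
```

If the preview still prints "permissive" or "enforcing", the substitution did not match and the file needs to be edited by hand.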




Disable THP (Transparent Huge Pages) (Optional)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vi /etc/systemd/system/disable-thp.service


[Unit]
Description=Disable Transparent Huge Pages (THP)

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/sh -c "echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled && echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag"

[Install]
WantedBy=multi-user.target


sudo systemctl daemon-reload

sudo systemctl start disable-thp
sudo systemctl enable disable-thp
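
Whether the service had an effect can be verified by reading the two sysfs files back: the kernel marks the active value with square brackets, for example "always madvise [never]". A minimal sketch follows; the function name thp_active_value is made up for illustration.

```shell
#!/bin/sh
# Extract the bracketed (active) value from a THP sysfs line read on stdin,
# e.g. "always madvise [never]" gives "never".
thp_active_value() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report both settings if the files exist on this machine
for name in enabled defrag; do
    path=/sys/kernel/mm/transparent_hugepage/$name
    if [ -r "$path" ]; then
        echo "$name: $(thp_active_value < "$path")"
    fi
done
```

After the service has run, both lines should report "never".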



Update VM Swappiness
~~~~~~~~~~~~~~~~~~~~
sysctl vm.swappiness
sysctl -w vm.swappiness=1

echo "vm.swappiness=1" >> /etc/sysctl.d/99-sysctl.conf

# alternative: check the value via /proc and add the line to /etc/sysctl.conf instead
#cat /proc/sys/vm/swappiness
#echo "vm.swappiness = 1" >> /etc/sysctl.conf
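
Appending with >> adds a new line every time the command is repeated. A repeatable variant can be sketched as follows; the function name set_sysctl_line is made up for illustration, and the dots in the key are treated as regex wildcards, which is good enough for this purpose.

```shell
#!/bin/sh
# Set "key=value" in a sysctl config file exactly once:
# existing lines for the key are dropped, then the new line is appended.
set_sysctl_line() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        # grep -v returns 1 when nothing is left over; that is not an error here
        grep -v "^[[:space:]]*$key[[:space:]]*=" "$file" > "$tmp" || true
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage (as root), followed by a reload of all sysctl config files:
#   set_sysctl_line /etc/sysctl.d/99-sysctl.conf vm.swappiness 1
#   sysctl --system
```

Running the function twice leaves a single vm.swappiness line in the file.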

Expand EBS Volume or HVM Resize - CentOS 7 with HVM doesn't need a resize
~~~~~~~~~~~~~~~~~
df -h
sudo resize2fs /dev/xvde
sudo resize2fs /dev/xvda1
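
Note that resize2fs only handles ext2/ext3/ext4 filesystems; XFS, the default filesystem of a standard CentOS 7 installation, is grown with xfs_growfs, which takes the mount point instead of the device. The sketch below only prints the matching command and does not run it; the function name grow_command_for is made up, and the device names are the ones from the notes above.

```shell
#!/bin/sh
# Print the command that grows a filesystem of the given type.
# Arguments: filesystem type, block device, mount point.
grow_command_for() {
    fstype=$1
    device=$2
    mountpoint=$3
    case "$fstype" in
        ext2|ext3|ext4) echo "resize2fs $device" ;;
        xfs)            echo "xfs_growfs $mountpoint" ;;
        *)              echo "unsupported filesystem: $fstype" >&2; return 1 ;;
    esac
}

# Usage: look up the filesystem type with "df -T /", then for example
#   grow_command_for xfs  /dev/xvda1 /
#   grow_command_for ext4 /dev/xvde  /data    # /data is a hypothetical mount point
```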


Install Other Packages
~~~~~~~~~~~~~~~~~~~~~~

# useful tools
yum -y install wget
yum -y install yum-utils
yum -y install unzip

# update os
yum -y update


Network Time Protocol
~~~~~~~~~~~~~~~~~~~~~
# use chronyd, not ntpd
# it is the default on CentOS 7 and a modern implementation of the Network Time Protocol
systemctl status chronyd.service
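
A running service does not yet mean the clock is synchronised. The "Leap status" line of chronyc tracking reads "Normal" once chronyd is in sync, so a check can be sketched as follows; the function name chrony_is_synced is made up for illustration.

```shell
#!/bin/sh
# Reads the output of "chronyc tracking" on stdin.
# Exit status 0 if the "Leap status" field is "Normal", otherwise 1.
chrony_is_synced() {
    grep -q '^Leap status[[:space:]]*:[[:space:]]*Normal'
}

# Usage:
#   chronyc tracking | chrony_is_synced && echo "clock is synchronised"
```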


Image of our Choice
~~~~~~~~~~~~~~~~~~~~~
# our image
CentOS 7 (x86_64) - with Updates HVM Image
https://aws.amazon.com/marketplace/pp/B00O7WM7QW
ami-0affd4508a5d2481b

2020-05-29 Module 4: Agenda

Week 4: Module 4 „Cloudera Installation Part 1“

10:00 – 10:40 (0:40)
Virtual Classroom
– Welcome
– Presentation: AMI with CentOS 7 vs. CentOS 6.5
– Selection of 2 participants to present and discuss the Digital Content

10:40 – 11:34 (0:54)
Digital Content
4.2 Install – Cloudera Data Hadoop (CDH) Quick Install (18′)
4.3 Cloudera Installation Phases and Paths (2′)
4.4 Cloudera Manager Introduction and Overview (4′)
4.5 Cloudera Parcels (2′)
4.6 Cloudera Repository Setup with Apache httpd (10′)
4.7 Cloudera Installation Path B with local repository – AMI Prepare (18′)

11:34 – 12:00 (0:26)
Break/Lunch

12:00 – 13:00 (1:00)
Virtual Classroom
– Presentation and discussion (0:45)
– – content of the Digital Content
– – Q&A
– – Differences between the video and how we will do it
– Support needed (walkthrough etc.)

13:00 – 13:05 (0:05)
Break

13:05 – 13:35 (0:30)
Virtual Classroom
– Vocabulary
– Topics of Daily Business

13:35 – 13:55 (0:20)
Open Discussion & Goodbye
– Off Topics
– Homework for next week

13:55
End

2020-05-22 Meeting Notes

To Do: Thorsten

How to fill the first 10 minutes of a meeting?
A guideline and collection of common phrases, typical conversation starters, etc., to make switching from German to English easier and give the meeting a rolling start.

Decisions:

We decided to use CentOS 7 to build our own Cloudera setup while following the video course.
Differences between the systems (commands, behaviour, …) will be explained and supported by some of our participants.

Good To Know:

Swappiness
0 - off
1 - min
100 - max (aggressive)

Things we talked about:

Finding an image among the AWS AMIs besides using Google 🙂 and the AMI ID

AWS instances
Are 96 CPUs enough for your data center, or do we need our own hardware?

2020-05-22 Module 3: Agenda

Week 3: Module 3 „Hadoop“

10:00
Virtual Classroom 20 min.
– Welcome, urgent questions, technical setup

10:20
Digital Content 14 min.
– – 2.8 AWS – EC2 Spot Instances (5′)
– – 2.9 AWS – Relational Data Service (RDS) (9′)

10:34
Break: 6 min.

10:40
Virtual Classroom 55 min.
– Discussion: what is known, what was understood (10′)
– Summary of Section 2 (Modules 1 and 2) (20′)
– Application, problems, experiences, best practice (25′)

11:35
Break: 25 min.

12:00
Digital Content 37 min.
– 3.0 Section 3: Hadoop Foundation on HDFS and YARN
– – 3.1 HDFS – Hadoop Distributed File System (4′)
– – 3.2 YARN – Yet another Resource Negotiator (3′)
– – 3.3 MySQL Database setup and Installation Materials (15′)
– 4.0 Section 4: Cloudera Installation – Repository setup, httpd, path B
– – 4.1 Prepare AWS AMI for Cloudera Installation Materials (15′)

12:37
Break: 3 min.

12:40
Virtual Classroom 60 min.
– Discussion: what is known, what was understood
– Missing vocabulary
– Summary of Section 3
– Application, problems, experiences, best practice
– Homework for the next module (2 participants)

13:40
Open Discussion & Goodbye 20 min.

14:00
End