Data Science Dev

Design, develop, deploy

Data science dev blog

Git Commands

Create a Repository (HTTP)

git clone http://<server address>/<username>/<repository>.git
cd <folder>
touch README.md
git add README.md
git commit -m "add README"
git push -u origin master

 

Existing Folder (HTTP)

cd existing_folder
git init
git remote add origin http://<server address>/<username>/<repository>.git
git add .
git commit -m "Initial commit"
git push -u origin master

Single Control-Plane Cluster with Kubernetes

 


Objective: Install Kubernetes on a single machine divided into three VirtualBox nodes (VMs) with a single control plane.

References:

  1. https://medium.com/@KevinHoffman/building-a-kubernetes-cluster-in-virtualbox-with-ubuntu-22cd338846dd
  2. https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
  3. https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
  4. https://docs.projectcalico.org/getting-started/kubernetes/quickstart
  5. https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

 

Part 1: Setup VirtualBoxes

  1. Install Ubuntu 18.04 LTS on your machine.
  2. Install VirtualBox on the machine
  3. Create a VirtualBox VM with Ubuntu 18.04 on it, with 2-4 GB of RAM and 2-4 CPU cores.
  4. Go to File->Host Network Manager and create a Host Network called vboxnet0 with IPv4 Address: 192.168.99.1 and IPv4 Network Mask: 255.255.255.0
  5. Set Network Adapter #2 of the VirtualBox VM to Host-only Adapter and set name to vboxnet0.
  6. Install Docker CE in the VirtualBox VM
  7. Turn the swap disk off (sudo swapoff -a and comment out line with swap in /etc/fstab)
  8. sudo apt update && sudo apt install -y openssh-server net-tools
  9. Power down the VirtualBox VM
  10. Clone the VirtualBox VM 3 times (Full clone, not linked).  Name one kubemaster, one worker1, and one worker2.
  11. Change worker1 and worker2 to 1 CPU core and 2 GB memory if you wish, depending on your resources.  Kubemaster must have at least 2 cores.
  12. Start up all three VMs (kubemaster, worker1, worker2).  The original VM is left as a backup in case something goes wrong.
  13. Change /etc/hostname in each of the VMs to its respective name (kubemaster, worker1, worker2)
  14. Run ifconfig -a and check the name of the host-only adapter (e.g. enp0s8)
  15. Add to the bottom of /etc/network/interfaces in each VM, where # is 0,1,2 for kubemaster, worker1, and worker2.  Also be sure to substitute your name for the adapter in enp0s8 if it is different:

auto enp0s8
iface enp0s8 inet static
address 192.168.99.2#
netmask 255.255.255.0
network 192.168.99.0
broadcast 192.168.99.255

 

  16. Run “sudo ufw disable” in each of the VMs and then reboot each one.
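
The per-node substitution in step 15 is easy to get wrong by hand, so here is a minimal Python sketch that renders the stanza for each VM. The helper name interfaces_stanza and the NODES table are my own illustration, not part of any tool; the addresses and the default adapter name enp0s8 come from the steps above.

```python
# Hypothetical helper: renders the static-IP stanza from step 15 for one node.
# NODES maps each VM name to the digit that replaces '#' in 192.168.99.2#.
NODES = {"kubemaster": 0, "worker1": 1, "worker2": 2}


def interfaces_stanza(node, adapter="enp0s8"):
    """Return the text to append to /etc/network/interfaces for `node`."""
    index = NODES[node]
    lines = [
        "auto {}".format(adapter),
        "iface {} inet static".format(adapter),
        "address 192.168.99.2{}".format(index),
        "netmask 255.255.255.0",
        "network 192.168.99.0",
        "broadcast 192.168.99.255",
    ]
    return "\n".join(lines) + "\n"


# Print the stanza for every VM so it can be pasted into the right machine.
for name in sorted(NODES):
    print("# {}".format(name))
    print(interfaces_stanza(name))
```

Pass adapter="..." if ifconfig -a reported a different name for the host-only adapter in step 14.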

 

Part 2: Install kubeadm, kubelet, and kubectl on each VM.

  1. Create a file /etc/sysctl.d/k8s.conf and put the following lines in it for each VM:

          net.bridge.bridge-nf-call-ip6tables = 1

          net.bridge.bridge-nf-call-iptables = 1

  2. Run the following commands on each VM (both need root):

          sudo modprobe br_netfilter

          sudo sysctl --system

  3. Run the following commands on each VM.  The last 4 commands may not work, but that’s OK:

sudo apt-get install -y iptables arptables ebtables

sudo update-alternatives --set iptables /usr/sbin/iptables-legacy

sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy

sudo update-alternatives --set arptables /usr/sbin/arptables-legacy

sudo update-alternatives --set ebtables /usr/sbin/ebtables-legacy

  4. Run the following commands on each VM:

sudo apt-get update && sudo apt-get install -y apt-transport-https curl

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF

sudo apt-get update

sudo apt-get install -y kubelet kubeadm kubectl

sudo apt-mark hold kubelet kubeadm kubectl
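
Before moving on, it can help to confirm that the k8s.conf file from step 1 really contains both bridge settings. Below is a small Python sketch that parses sysctl-style text and reports which required keys are missing or wrong. The function names are hypothetical, and the parser only handles the simple "key = value" form used above.

```python
# Settings required by step 1 of Part 2.
REQUIRED = {
    "net.bridge.bridge-nf-call-ip6tables": "1",
    "net.bridge.bridge-nf-call-iptables": "1",
}


def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blanks and '#' or ';' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_settings(text):
    """Return the required keys that are absent or not set to the expected value."""
    settings = parse_sysctl_conf(text)
    return sorted(k for k, v in REQUIRED.items() if settings.get(k) != v)


example = """
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
"""
print(missing_settings(example))
```

On a VM you would read the text with open("/etc/sysctl.d/k8s.conf").read() and pass it to missing_settings; an empty list means both lines are in place.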

 

Part 3: Install Calico (or some Pod network add-on) and join the worker nodes to setup the cluster.

  1. In the kubemaster VM only use the following commands:

           sudo kubeadm init --apiserver-advertise-address=192.168.99.20 --pod-network-cidr=192.168.0.0/16

  2. Copy the command shown to join nodes and save it somewhere.
  3. In the kubemaster VM only, use the following commands:

           mkdir -p $HOME/.kube

           sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

           sudo chown $(id -u):$(id -g) $HOME/.kube/config

           kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

           watch kubectl get pods --all-namespaces

  4. Wait until all pods are in “Running” status and then press Ctrl-C
  5. In the kubemaster VM only, use the following command:

           kubectl taint nodes --all node-role.kubernetes.io/master-

  6. In each of the worker nodes, execute the join command copied in step 2.
  7. You can reboot the machines at this point.  I did, but it’s not necessary.
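
One thing worth knowing about the addresses used here: the pod network range passed to kubeadm init (192.168.0.0/16) contains the host-only network from Part 1 (192.168.99.0/24). The steps in this post are unchanged by that, but if pods and nodes ever have trouble reaching each other, the overlap is something to rule out. A short Python sketch with the standard ipaddress module shows the check. The helper name ranges_overlap is hypothetical, and 10.244.0.0/16 is only an illustrative non-overlapping range, not a value from this setup.

```python
import ipaddress


def ranges_overlap(cidr_a, cidr_b):
    """Return True if the two CIDR ranges share any addresses."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


POD_CIDR = "192.168.0.0/16"     # --pod-network-cidr from step 1
HOST_ONLY = "192.168.99.0/24"   # vboxnet0 network from Part 1

print(ranges_overlap(POD_CIDR, HOST_ONLY))         # True
print(ranges_overlap("10.244.0.0/16", HOST_ONLY))  # False
```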

 

Part 4: Test the Cluster

  1. In the kubemaster, type:

kubectl get nodes -o wide

                You should see 3 nodes: 1 master (kubemaster) and 2 worker nodes (worker1 and worker2)

  2. Create a yaml file with the following contents:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 9
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

  3. Enter the following commands in kubemaster:

kubectl create -f <filename of yaml in step 2>

kubectl get pods -o wide

  4. You should see all the pods running (9 pods) on different nodes.
  5. To delete the deployment:

kubectl delete deploy --all
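
The nesting of the manifest in step 2 matters more than anything else in it, and indentation is easy to lose when copying YAML around. As a cross-check, here is the same Deployment sketched as a Python dictionary and dumped as JSON, which kubectl create -f also accepts. The function name nginx_deployment is my own; the field values are the ones from step 2.

```python
import json


def nginx_deployment(replicas=9):
    """Build the nginx Deployment from step 2 as a plain dictionary."""
    labels = {"app": "nginx"}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "nginx-deployment", "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "nginx",
                            "image": "nginx:1.14.2",
                            "ports": [{"containerPort": 80}],
                        }
                    ]
                },
            },
        },
    }


manifest_json = json.dumps(nginx_deployment(), indent=2)
print(manifest_json)
```

Saving the printed text to a file such as nginx-deployment.json (a name chosen here for illustration) gives a file that can be used in step 3 in place of the yaml file.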

Install Tensorflow 1.14.0 on Raspberry Pi 3+ with Python 3.6.8

1. sudo su

2. apt install libhdf5-dev

3. python3.6 -m pip install --upgrade pip setuptools

4. python3.6 -m pip install --upgrade six

5. python3.6 -m pip install tensorflow==1.14.0

6. sudo nano /etc/dphys-swapfile

7. change CONF_SWAPSIZE=100 to CONF_SWAPSIZE=2048 and exit nano

8. sudo /etc/init.d/dphys-swapfile stop

9. sudo /etc/init.d/dphys-swapfile start

10. free -m

If you want other libraries:

11. sudo apt install -y libblas-dev liblapack-dev git cmake gfortran libatlas-base-dev libfreetype6-dev

12. sudo python3.6 -m pip install --upgrade testresources cython

13. sudo su; python3.6 -m pip install --upgrade scipy; exit

14. sudo python3.6 -m pip install --upgrade pandas

15. sudo python3.6 -m pip install --upgrade scikit-learn scikit-image imageio "dask[complete]" matplotlib

If you are using dask, you sometimes have to upgrade some python packages so that their versions match on all your worker and master nodes, for example:

16. sudo python3.6 -m pip install --upgrade distributed numpy pandas
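
If you set up several Pis, the edit in steps 6 and 7 can be scripted instead of done in nano. This is a minimal Python sketch that rewrites the CONF_SWAPSIZE line in the text of /etc/dphys-swapfile. The function name set_swapsize is hypothetical, and the sketch assumes the file contains an uncommented CONF_SWAPSIZE=<number> line, as in step 7.

```python
import re


def set_swapsize(config_text, size_mb=2048):
    """Return config_text with the CONF_SWAPSIZE line set to size_mb."""
    # Match only an uncommented assignment that fills the whole line.
    pattern = re.compile(r"^CONF_SWAPSIZE=\d+[ \t]*$", flags=re.MULTILINE)
    new_text, count = pattern.subn("CONF_SWAPSIZE={}".format(size_mb), config_text)
    if count == 0:
        raise ValueError("no CONF_SWAPSIZE line found")
    return new_text


sample = "# swap size in MB\nCONF_SWAPSIZE=100\n#CONF_MAXSWAP=2048\n"
print(set_swapsize(sample))
```

The result still has to be written back as root, and steps 8 and 9 are still needed for the new size to take effect.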

Install Python 3.6.8 on Raspberry Pi 3+

1. sudo apt update

2. sudo apt install -y build-essential tk-dev libncurses5-dev libncursesw5-dev libreadline6-dev libdb5.3-dev libgdbm-dev libsqlite3-dev libssl-dev libbz2-dev libexpat1-dev liblzma-dev zlib1g-dev libffi-dev

3. wget https://www.python.org/ftp/python/3.6.8/Python-3.6.8.tgz

4. sudo tar zxf Python-3.6.8.tgz

5. cd Python-3.6.8

6. sudo ./configure

7. sudo make -j 4

8. sudo make altinstall

Installing Python Libraries on Jetson TX2 / Jetpack 4.2.2

Just a reminder for me on the steps to install certain libraries on Jetson TX2 with Jetpack 4.2.2.

1. Install Jetpack 4.2.2 with Tensorflow 1.14 checked.

2. sudo apt update

3. sudo apt install -y libblas-dev liblapack-dev git cmake gfortran libatlas-base-dev libfreetype6-dev  

4. sudo apt autoremove

5. sudo pip3 install --upgrade testresources setuptools six

6. sudo pip3 install --upgrade imageio pandas "dask[complete]" cython matplotlib

7. sudo su; sudo pip install --upgrade scipy; exit

8. sudo apt install python3-opencv

9. sudo pip3 install --upgrade scikit-learn scikit-image
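
After the last step, a quick way to see whether everything landed is to ask Python which modules it can find. The sketch below uses the standard importlib module and does not actually import the (slow to load) libraries. The list holds the import names for the packages above, which for some differ from the install names (cv2 for python3-opencv, sklearn for scikit-learn, skimage for scikit-image); the function name missing_modules is my own.

```python
import importlib.util

# Import names for the packages installed in the steps above.
MODULES = [
    "tensorflow", "scipy", "pandas", "dask", "imageio",
    "matplotlib", "cv2", "sklearn", "skimage",
]


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("missing:", missing_modules(MODULES))
```

Run it with python3 on the Jetson; an empty list means every package is visible to that interpreter.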

Tools to Annotate Images

Here is a link to many manual tools for labeling and annotating images. 

https://en.wikipedia.org/wiki/List_of_manual_image_annotation_tools

Ratsnake seems good for image segmentation.

 

 

Docker on Ubuntu Tutorial

I experimented with Docker lately.  Here are a few commands, notes, and issues I resolved while working on Ubuntu 16.04.  I'm assuming you have Docker installed (sudo apt-get update && sudo apt-get install docker-ce).


Tutorial on the web for "dockerizing" python applications: https://runnable.com/docker/python/dockerize-your-python-application


Some python example code I made to test with (imgtest.py).  This is the code I wanted to place in a docker image.  It basically makes a random image and saves it.

import numpy as np
from PIL import Image
import os
 
img = np.random.randint(0,255,size=(1024,768))
img = img.astype('float')
img = Image.fromarray(img)
img = img.convert('RGB')
img.save('/app/test.jpg')
 

Here is an example Dockerfile I used.  I saved as Dockerfile and placed in same folder as my python script.

FROM python:2
 
ADD imgtest.py /
 
RUN pip install numpy
RUN pip install pillow
RUN mkdir /app
 
CMD ["python","./imgtest.py"]

To build the docker image I moved to the folder where both the Dockerfile and python script reside.  And then typed:
 
sudo docker build .  

When building, I had to add DNS servers to the Docker config as described below (from https://stackoverflow.com/questions/28668180/cant-install-pip-packages-inside-a-docker-container-with-ubuntu).  I tried the other methods mentioned there, but none of them seemed to work except this one.  I also commented out the DockerOpts line in /etc/default/docker.

"For Ubuntu users

You need to add new DNS addresses in the docker config

sudo nano /lib/systemd/system/docker.service

Add the dns after ExecStart.

--dns 10.252.252.252 --dns 10.253.253.253

Should look like that:

ExecStart=/usr/bin/dockerd -H fd:// --dns 10.252.252.252 --dns 10.253.253.253

systemctl daemon-reload
sudo service docker restart

"


After your build you should be able to list the docker images available using the following command:

sudo docker images

 
 
I wanted to run the docker image and make a container.  If I ran it with:
 
sudo docker run -it <image_id>
 
It would just execute, create the random image output in the container, and then the container would disappear with no random image output persisted in storage.  Instead, I used a bind mount.  A bind mount lets me connect a directory in the container to a directory on the host, so the resulting random image is placed on the docker host.
 
sudo docker run -d -it -v <host directory>:<container directory> <image_id>
 
So on my computer it would look something like:
 
sudo docker run -d -it -v /home/npropes/Desktop:/app  4a823423bca
 
The random image output should appear on my Desktop.
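
One detail of the -v flag that is easy to trip over: as far as I know, Docker only treats the host side as a bind mount when it is an absolute path, and otherwise reads it as the name of a volume. Here is a minimal Python sketch that builds the same command with the host directory made absolute first. The function name docker_run_command is hypothetical; the flags are the ones used above.

```python
import os


def docker_run_command(host_dir, container_dir, image_id):
    """Build the 'docker run' argument list with a bind mount."""
    # Make the host side absolute so Docker treats it as a bind mount.
    host_dir = os.path.abspath(host_dir)
    mount = "{}:{}".format(host_dir, container_dir)
    return ["sudo", "docker", "run", "-d", "-it", "-v", mount, image_id]


cmd = docker_run_command("/home/npropes/Desktop", "/app", "4a823423bca")
print(" ".join(cmd))
```

The list form can be passed straight to subprocess.run(cmd) if you want to launch the container from a script.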

Additional commands below:

I can see the list of docker containers that are running or have run by using this command:

sudo docker ps -a
 
To stop docker containers use these commands:
 
sudo docker stop <container>
sudo docker kill <container>
 
To delete a docker container:
 
sudo docker rm <container>
 
To delete a docker image:
 
sudo docker rmi <image>
 
 
 

CNN + RNN TensorFlow Example Code

This is example code for a CNN + RNN structure used for analyzing time-series data.    There is a separate CNN structure for each time step of windowed data.  The RNN learns the time dependency between feature vectors extracted by the CNNs. 

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import math

# 1-d convolutional layer
def conv1d(X, num_filters=8, filter_width=3, stride=1, padding='SAME'):
    # helper function for a 1D convolutional filter
    # initalize filter
    window_size = int(X.get_shape()[1])
    num_sensors = int(X.get_shape()[2])
    stddev = 1
    f = tf.Variable(tf.truncated_normal((filter_width,num_sensors,num_filters),stddev=.2),trainable=True,name='conv1d_filter')
    # initialize bias
    b = tf.Variable(0.0,name='conv1d_bias')
    conv = tf.nn.conv1d(value=X,filters=f,stride=stride,padding=padding,name='conv1d_op')
    return tf.add(conv,b)

# print out graph structure
def print_graph():
    # prints the graph operations out
    with tf.Session() as sess:
        op = sess.graph.get_operations()
    for o in op:
        print(o.outputs)

# container to hold cnnrnn model structure
class cnnrnn_model:
    def __init__(self,time_steps,window_size,num_sensors,filters,filter_size,rnn_nodes):
        ###### model creation #############################################################
        # placeholders
        self.X = tf.placeholder(tf.float32,[None,time_steps,window_size,num_sensors],name='X')
        self.Y = tf.placeholder(tf.float32,[None,time_steps,1],name='Y')

        # create the convolutional layers for each CNN per time step
        m = []
        for i in range(0,time_steps):
            # batch, time_step, window_size, num_sensors
            m1 = conv1d(self.X[:,i,:,:], num_filters=filters*1, filter_width=filter_size, stride=1, padding='SAME')
            m1 = tf.nn.relu(m1,name='relu1d')
            m1 = tf.nn.pool(m1, window_shape=(4,), pooling_type='MAX', padding='SAME', strides=(4,), name='pool1d')
            m1 = conv1d(m1, num_filters=filters*1, filter_width=filter_size, stride=1, padding='SAME')
            m1 = tf.nn.relu(m1,name='relu1d')
            m1 = tf.nn.pool(m1, window_shape=(4,), pooling_type='MAX', padding='SAME', strides=(4,), name='pool1d')
            m1 = conv1d(m1, num_filters=filters*1, filter_width=filter_size, stride=1, padding='SAME')
            m1 = tf.nn.relu(m1,name='relu1d')
            m1 = tf.nn.pool(m1, window_shape=(2,), pooling_type='MAX', padding='SAME', strides=(2,), name='pool1d')
            m1 = conv1d(m1, num_filters=filters*1, filter_width=1, stride=1, padding='SAME')
            m1 = tf.nn.relu(m1,name='relu1d')
            m1 = tf.nn.pool(m1, window_shape=(2,), pooling_type='MAX', padding='SAME', strides=(2,), name='pool1d')
            sh1 = int(m1.get_shape()[1])
            sh2 = int(m1.get_shape()[2])
            m1 = tf.reshape(m1, [-1,1,sh1*sh2])
            m.append(m1)
            
        c = tf.concat(m,1)

        basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=rnn_nodes)
        model, states = tf.nn.dynamic_rnn(cell=basic_cell, inputs=c, dtype=tf.float32, time_major=False)

        self.model = tf.layers.dense(model,units=1,activation=None)

        self.loss = tf.losses.mean_squared_error(self.Y, self.model)

        optimizer = tf.train.AdamOptimizer(1e-3)
        self.training_op = optimizer.minimize(self.loss) 
        self.init = tf.global_variables_initializer()
        ###### end model creation #############################################################

# some simulated data to play with
def create_test_data(batch_size, time_steps, window_size, num_sensors):
    # create fake training data for testing neural net
    x = np.zeros((batch_size,time_steps,window_size,num_sensors))
    y = np.zeros((batch_size,time_steps,1)) # these are the outputs of the RNN + dense layer
    num_examples = batch_size + time_steps
    xe = np.zeros((num_examples,window_size,num_sensors))
    ye = np.zeros((num_examples,1))
    # normal case (no fault)
    for e in range(0,num_examples//2): # each example (integer division keeps the bound an int)
        wn = 1
        d = 0.73
        c2 = -wn*wn
        c1 = -2*d*wn
        c3 = 1
        x1 = 0
        x2 = 0
        for s in range(0,window_size):  # each sample
            x1 = x1 + 0.4*x2
            x2 = x2 + 0.4*(c1*x2 +c2*x1 + c3)
            xe[e,s,0] = -x1*c2 + np.random.randn()*0.1
            xe[e,s,1] = x2 + np.random.randn()*0.1
            ye[e,0] = 1.0
    # fault case (damping coefficient changing)
    for e in range(num_examples//2,num_examples):
        i = e - num_examples//2
        wn = 1
        d = 0.72 - 0.3*float(i+1)/(num_examples//2)
        c2 = -wn*wn
        c1 = -2*d*wn
        c3 = 1
        x1 = 0
        x2 = 0
        for s in range(0,window_size):
            x1 = x1 + 0.4*x2
            x2 = x2 + 0.4*(c1*x2 +c2*x1 + c3)
            xe[e,s,0] = -x1*c2 + np.random.randn()*0.1
            xe[e,s,1] = x2 + np.random.randn()*0.1
            ye[e,0] = math.exp(-0.1*i)
    # reorganize data into timesteps
    for b in range(0,batch_size):
        for t in range(0,time_steps):    
            x[b,t,:,:] = xe[b + t,:,:]
            y[b,t,:] = ye[b + t,:]
    return x,y

###### model parameters ###########################################################
time_steps = 5
window_size = 64
num_sensors = 2 
filters = 6
filter_size= 3
rnn_nodes= 8

###### training parameters ########################################################
batch_size = 64
n_epochs = 501

###### create test data ###########################################################
trainX, trainY = create_test_data(batch_size, time_steps, window_size, num_sensors)

###### model creation #############################################################
model = cnnrnn_model(time_steps,window_size,num_sensors,filters,filter_size,rnn_nodes)

###### saver object to save and restore model variables ###########################
saver = tf.train.Saver()

###### model training #############################################################
with tf.Session() as sess:
    sess.run(model.init)       
    for e in range(0,n_epochs):
        sess.run(model.training_op, feed_dict={model.X: trainX, model.Y: trainY})
        loss_out = sess.run(model.loss, feed_dict={model.X: trainX, model.Y: trainY})
        if (e+1) % 100 == 1:
            print('epoch = ' + str(e+1) + '/' + str(n_epochs) + ', loss = ' + str(loss_out))
        if (e+1) % 5000 == 1:
            print('epoch = ' + str(e+1) + '/' + str(n_epochs) + ', loss = ' + str(loss_out))
    saver.save(sess,'/tmp/test-model')
    result = sess.run(model.model, feed_dict={model.X: trainX, model.Y: trainY})
    print(result.T)
###### end model training #########################################################

###### example restoring model ####################################################
tf.reset_default_graph()
with tf.Session() as sess:
    new_saver = tf.train.import_meta_graph('/tmp/test-model.meta')
    new_saver.restore(sess,tf.train.latest_checkpoint('/tmp/'))
    graph = tf.get_default_graph()
    X = graph.get_tensor_by_name("X:0")
    Y = graph.get_tensor_by_name("Y:0")
    model = graph.get_tensor_by_name('dense/BiasAdd:0')
    result = sess.run(model, feed_dict={X: trainX, Y: trainY})
    print(result.T)
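
To see what the RNN actually receives from each CNN, it helps to follow the window length through the pooling layers. The convolutions use stride 1 with 'SAME' padding, so they keep the length; each 'SAME' max pool with stride s shrinks it to ceil(length / s). The sketch below does that arithmetic in plain Python. The function names are my own, and the default strides (4, 4, 2, 2) are read off the four tf.nn.pool calls in the model.

```python
import math


def same_pool_length(length, stride):
    """Output length of a 'SAME'-padded pooling layer with the given stride."""
    return int(math.ceil(length / float(stride)))


def cnn_feature_size(window_size, filters, pool_strides=(4, 4, 2, 2)):
    """Size of the flattened feature vector one CNN hands to the RNN."""
    length = window_size
    for stride in pool_strides:
        length = same_pool_length(length, stride)
    # Each remaining position carries one value per filter.
    return length * filters


# With the parameters above: 64 -> 16 -> 4 -> 2 -> 1 positions, times 6 filters.
print(cnn_feature_size(64, 6))  # 6
```

So with window_size = 64 and filters = 6, the tensor fed to dynamic_rnn has shape (batch, time_steps, 6).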

Installing Tensorflow 1.3 / CUDA Toolkit 8.0 / cuDNN 6.0 on ASUS GL502VS-DS71 Laptop with Ubuntu 16.04 and Nvidia 1070


My ASUS GL502VS-DS71 laptop had some operating system problems recently, so I decided to reinstall Ubuntu on it.  Somehow the Nvidia driver got updated to a more recent version (384.90) that didn't work with Tensorflow.  This is an excellent opportunity to refresh the installation procedure of Tensorflow in the blog.  If I installed the CUDA Toolkit 8.0 with the included Nvidia driver, it would not let me log in to Ubuntu.  It kept returning to the login screen.  This is probably because the driver that comes with the CUDA Toolkit is too old to support the Nvidia 1070 card built into my laptop.  Therefore, we need to install an Nvidia driver that works with the 1070 card first, and then install the CUDA Toolkit 8.0 without the included Nvidia driver.  The instructions are below:

1. Reboot computer and get into BIOS (delete/DEL key while restarting or other key)

2. If your motherboard has Secure Boot, turn it off/disable.  Save BIOS changes and reboot.

3. Install Ubuntu 16.04.x

4. After installation, open terminal.

5. sudo add-apt-repository ppa:graphics-drivers/ppa

6. sudo apt-get update

7. sudo apt-get upgrade

8. Open Software & Updates from launcher.

9. Select the Additional Drivers tab.

10. Select the Using NVIDIA binary driver - version 378.xx (I have 378.13)

11. Reboot computer.

12. Download NVIDIA CUDA Toolkit 8.0 (use .runfile only and you may have to search for this version in their archives since the newer version won't work with tensorflow) and NVIDIA cuDNN library 6.0 (again not the newest version 7.0) from NVIDIA's website.  You have to login for cuDNN libraries.  

13. Follow instructions on NVIDIA website to install cuda toolkit (and patch if available) but do not upgrade NVIDIA driver (you will be asked during .runfile execution) or change default install directories

14. Follow instructions on NVIDIA website to install cuDNN (I put it in my home directory in a folder called cuda)

15. Edit ~/.bashrc and add the following lines at the end of the file:
export LD_LIBRARY_PATH=~/cuda/lib64/:/usr/local/cuda-8.0/lib64/:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-8.0/bin:$PATH

16. Logout of Ubuntu and log back in

17. Open terminal.

18. sudo apt-get install libblas-dev liblapack-dev libjpeg-dev python-dev

19a. For python 2.7:
sudo apt-get install python-pip
sudo apt-get install idle
sudo pip install --upgrade pip
sudo pip install tensorflow-gpu (if you have a GPU)

19b. For python 3.5:
sudo apt-get install python3-pip
sudo apt-get install idle3
sudo pip3 install --upgrade pip
sudo pip3 install tensorflow-gpu (if you have a GPU)

20. Reboot
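
If Tensorflow later complains that it cannot load a CUDA or cuDNN library, the first thing I would check is whether the entries from step 15 made it into the environment. Here is a minimal Python sketch of that check. The function name missing_path_entries is hypothetical; the two directories are the ones from step 15.

```python
import os


def _normalize(path):
    """Expand '~' and drop trailing slashes so equal paths compare equal."""
    return os.path.normpath(os.path.expanduser(path))


def missing_path_entries(value, required):
    """Return the required directories not present in a ':'-separated path value."""
    entries = [_normalize(p) for p in value.split(":") if p]
    return [r for r in required if _normalize(r) not in entries]


REQUIRED_LIB_DIRS = ["~/cuda/lib64", "/usr/local/cuda-8.0/lib64"]

# Check the current environment; an empty list means both entries are set.
print(missing_path_entries(os.environ.get("LD_LIBRARY_PATH", ""), REQUIRED_LIB_DIRS))
```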

Installing Jetpack 3.1 on Jetson TX2 with Tensorflow 1.3 / Keras / hdf5 (.h5)

1. Download Jetpack 3.1 runfile from here on Host PC.

2. Run the installer "./JetPack-L4T-3.1-linux-x64.run" from terminal (you may need to navigate directory to location where downloaded).

3. Follow prompts and instructions to install.  The download may take a while and after it has completed, the instructions in a terminal will tell you to connect the Jetson to the Host PC in force recovery mode.  Make sure you connect an ethernet cable to the Jetson that is connected to the same router as the Host PC.

4. After Jetson has completed install, connect the Jetson HDMI to a monitor and a keyboard and mouse through the USB port (using a USB port expander dongle).  

5. Open a terminal in Ubuntu on the Jetson and enter the following commands (some may take a while to compile just be patient!).

6. sudo apt update

7. sudo apt-get install libblas-dev liblapack-dev python-dev idle nano python-pip

8. sudo pip install --upgrade pip

9. sudo pip install numpy

10. Download the wheel file provided by Peter Lee, https://github.com/peterlee0127/tensorflow-tx2

11. Install the wheel file by navigating to the directory of the downloaded wheel file and typing "sudo pip install tensorflow-1.3.0-cp27-cp27mu-linux_aarch64.whl"

12. If you need .h5 file load and save, install "sudo apt-get install libhdf5-dev", and then install, "sudo pip install h5py"

13. If you need Keras, install "sudo pip install keras"
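
A note on the wheel file name in step 11: the tags in it say which interpreter and platform the wheel was built for, which is why these steps use the Python 2 pip. The sketch below splits a wheel file name into its parts following the standard wheel naming scheme (name, version, optional build tag, python tag, ABI tag, platform tag). The function name parse_wheel_filename is my own, and the sketch does no validation beyond counting the parts.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its name, version, and compatibility tags."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(suffix)].split("-")
    # Five parts normally, six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel file name: " + filename)
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": parts[-3],
        "abi_tag": parts[-2],
        "platform_tag": parts[-1],
    }


info = parse_wheel_filename("tensorflow-1.3.0-cp27-cp27mu-linux_aarch64.whl")
print(info)
```

Here cp27 means CPython 2.7 and linux_aarch64 is the 64-bit ARM platform of the Jetson, so the wheel will not install with pip3 or on an x86 Host PC.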
