Thursday, July 30, 2015

How to Delete Big Files From Git History

The procedure below, adapted from various reference articles:

  1. List every object in the history together with its file name (you will need this later to map SHAs back to paths)
    $ git rev-list --objects --all | sort -k 2 > allfileshas.txt
  2. Get the last object SHA for all committed files and sort them in biggest to smallest order
    $ git gc && git verify-pack -v .git/objects/pack/pack-*.idx | egrep "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt
  3. Take that result and iterate through each line of it to find the SHA, file size in bytes, and real file name (you also need the allfileshas.txt output file from above)
    #!/bin/sh
    for SHA in `cut -f 1 -d\  < bigobjects.txt`; do
    echo $(grep $SHA bigobjects.txt) $(grep $SHA allfileshas.txt) | awk '{print $1,$3,$7}' >> bigtosmall.txt
    done;
  4. Use filter-branch to remove the file/directory (replace MY-BIG-DIRECTORY-OR-FILE with the path that you’d like to delete, relative to the root of the git repo)
    $ git filter-branch --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch MY-BIG-DIRECTORY-OR-FILE' --tag-name-filter cat -- --all
  5. Then clone the repo, making sure not to leave any hard links:
    $ git clone --no-hardlinks file:///Users/yourUser/your/full/repo/path repo-clone-name
  6. Change remote origin url
    $ git remote remove origin
    $ git remote add origin YOUR-PROJECT-GIT-URL
  7. To force-push your local changes to overwrite your GitHub repository, as well as all the branches you've pushed up
    $ git push origin --force --all
  8. In order to remove the sensitive file from your tagged releases, you'll also need to force-push against your Git tags
    $ git push origin --force --tags
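
Before (or instead of) making the fresh clone in step 5, you can check locally that the rewrite actually shrank the repository. A minimal sketch using standard git commands:

$ git reflog expire --expire=now --all
$ git gc --prune=now --aggressive
$ git count-objects -v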

Wednesday, January 21, 2015

Multiple Cassandra Nodes Across Different Data Centers

This walkthrough was carried out with Cassandra 1.0.12.

Topology

(Topology diagram: two data centers, DC1 and DC2, with three nodes each.)

Token assignment
Cassandra 1.0.12 does not ship with a token generator, so it is recommended to download https://raw.github.com/riptano/ComboAMI/2.2/tokentoolv2.py and generate a token assignment table, as shown below:


Node hostname IP Address Token Data Center Rack
node0 clouddb1.gc.net 172.16.70.32 0 DC1 RAC1
node1 clouddb2.gc.net 172.16.70.41 56713727820156410577229101238628035242 DC1 RAC1
node2 clouddb3.gc.net 172.16.70.42 113427455640312821154458202477256070485 DC1 RAC1
node3 clouddb4.gc.net 172.16.70.43 28356863910078205288614550619314017621 DC2 RAC1
node4 clouddb5.gc.net 172.16.70.44 85070591730234615865843651857942052863 DC2 RAC1
node5 clouddb6.gc.net 172.16.70.45 141784319550391026443072753096570088106 DC2 RAC1
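
The tokens above can also be reproduced by hand: with the RandomPartitioner the token space runs from 0 to 2^127, so each data center gets three evenly spaced tokens and DC2 is offset from DC1 by half a step. A minimal sketch using bc (not the output format of tokentoolv2.py):

for i in 0 1 2; do
  echo "DC1 node$i: $(echo "$i * 2^127 / 3" | bc)"
  echo "DC2 node$((i + 3)): $(echo "$i * 2^127 / 3 + 2^127 / 6" | bc)"
done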



Editing cassandra.yaml
Following the token assignment table, fill each node's token and hostname into initial_token and listen_address, for example:
In node 0
initial_token: 0
listen_address: clouddb1.gc.net


In node 1
initial_token: 56713727820156410577229101238628035242
listen_address: clouddb2.gc.net
and so on for the remaining nodes.


The snitch describes the cluster topology so that replicas are spread out and a single node failure does not lose all copies.
The topology is expressed in terms of data centers and racks; the test environment here is split into DC1 and DC2, with every node in the same first rack (RAC1).

The original comment in cassandra.yaml reads:
# Set this to a class that implements
# IEndpointSnitch.  The snitch has two functions:
# - it teaches Cassandra enough about your network topology to route
#   requests efficiently
# - it allows Cassandra to spread replicas around your cluster to avoid
#   correlated failures. It does this by grouping machines into
#   "datacenters" and "racks."  Cassandra will do its best not to have
#   more than one replica on the same "rack" (which may not actually
#   be a physical location)

Cassandra provides several snitch implementations:

  • SimpleSnitch
    Treats Strategy order as proximity. This improves cache locality when disabling read repair, which can further improve throughput. Only appropriate for single-datacenter deployments.

  • PropertyFileSnitch
    Proximity is determined by rack and data center, which are explicitly configured in cassandra-topology.properties.

  • RackInferringSnitch
    Proximity is determined by rack and data center, which are assumed to correspond to the 3rd and 2nd octet of each node's IP address, respectively. Unless this happens to match your deployment conventions (as it did Facebook's), this is best used as an example of writing a custom Snitch class.

  • Ec2Snitch
    Appropriate for EC2 deployments in a single Region. Loads Region and Availability Zone information from the EC2 API. The Region is treated as the Datacenter, and the Availability Zone as the rack. Only private IPs are used, so this will not work across multiple Regions.

  • Ec2MultiRegionSnitch
    Uses public IPs as broadcast_address to allow cross-region connectivity. (Thus, you should set seed addresses to the public IP as well.) You will need to open the storage_port or ssl_storage_port on the public IP firewall. (For intra-Region traffic, Cassandra will switch to the private IP after establishing a connection.)
This article uses PropertyFileSnitch as the endpoint snitch; it picks up changed property values at run time.
In conf/cassandra.yaml on node0 through node5:
endpoint_snitch: PropertyFileSnitch

Also edit conf/cassandra-topology.properties to assign each node to a data center and rack:
172.16.70.32=DC1:RAC1
172.16.70.41=DC1:RAC1
172.16.70.42=DC1:RAC1

172.16.70.43=DC2:RAC1
172.16.70.44=DC2:RAC1
172.16.70.45=DC2:RAC1

default=DC1:r1



Seed Node
Designate one seed node per data center: node0 (clouddb1) for DC1 and node3 (clouddb4) for DC2, matching the seeds list below.
In conf/cassandra.yaml on node0 through node5:
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "clouddb1.gc.ubicloud.net,clouddb4.gc.ubicloud.net"

Starting the services
Start the Cassandra process on node0 through node5 in sequence.
(Status screenshot omitted; see the nodetool ring output in the postscript below.)

================================================
Postscript

Create a sample keyspace:
$ ./cqlsh localhost
> CREATE KEYSPACE sample WITH strategy_class = 'NetworkTopologyStrategy' AND strategy_options:DC1 = '3' and strategy_options:DC2 = '3';

The resulting ring:
$ ./nodetool -h self ring
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               169417178424467235000914166253263322299
node0  172.16.70.32    DC1         RAC1        Up     Normal  93.18 KB        0.43%   0
node4  172.16.70.44    DC2         RAC1        Up     Normal  74.67 KB        32.91%  55989722784154413846455963776007251813
node1  172.16.70.41    DC1         RAC1        Up     Normal  97.89 KB        0.43%   56713727820156410577229101238628035242
node5  172.16.70.45    DC2         RAC1        Up     Normal  81.01 KB        32.91%  112703450604310824423685065014635287055
node2  172.16.70.42    DC1         RAC1        Up     Normal  97.66 KB        0.43%   113427455640312821154458202477256070484
node3  172.16.70.43    DC2         RAC1        Up     Normal  81.01 KB        32.91%  169417178424467235000914166253263322299

$ ./nodetool -h self describering sample
TokenRange:
  TokenRange(start_token:55989722784154413846455963776007251813, end_token:56713727820156410577229101238628035242, endpoints:[172.16.70.45, 172.16.70.43, 172.16.70.44, 172.16.70.41, 172.16.70.42, 172.16.70.32], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1)])
  TokenRange(start_token:113427455640312821154458202477256070484, end_token:169417178424467235000914166253263322299, endpoints:[172.16.70.43, 172.16.70.44, 172.16.70.45, 172.16.70.32, 172.16.70.41, 172.16.70.42], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1)])
  TokenRange(start_token:169417178424467235000914166253263322299, end_token:0, endpoints:[172.16.70.44, 172.16.70.45, 172.16.70.43, 172.16.70.32, 172.16.70.41, 172.16.70.42], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1)])
  TokenRange(start_token:56713727820156410577229101238628035242, end_token:112703450604310824423685065014635287055, endpoints:[172.16.70.45, 172.16.70.43, 172.16.70.44, 172.16.70.42, 172.16.70.32, 172.16.70.41], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1)])
  TokenRange(start_token:112703450604310824423685065014635287055, end_token:113427455640312821154458202477256070484, endpoints:[172.16.70.43, 172.16.70.44, 172.16.70.45, 172.16.70.42, 172.16.70.32, 172.16.70.41], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1)])
  TokenRange(start_token:0, end_token:55989722784154413846455963776007251813, endpoints:[172.16.70.44, 172.16.70.45, 172.16.70.43, 172.16.70.41, 172.16.70.42, 172.16.70.32], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1)])

The describering output shows that the ring is ordered as follows:
4 -> 1 -> 5 -> 2 -> 3 -> 0 -> 4

==================================================
With Cassandra 2.1.x instead, the result looks like this:

$ ./nodetool -h self status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 172.16.70.32 107.14 KB 256 ? b5d8b0c5-5c4c-43ad-b456-e3a0b2dbf348 RAC1
UN 172.16.70.41 141.17 KB 256 ? 51466f2f-a986-4843-9e36-6fca697301ac RAC1
UN 172.16.70.42 141.99 KB 256 ? f7faaba2-f5dd-46a0-b272-5fed57bf1123 RAC1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 172.16.70.43 111.84 KB 256 ? 810b84bf-25fd-4787-b406-9973339ef77f RAC1
UN 172.16.70.44 126.95 KB 256 ? 01db41b4-b000-46b0-99f4-063e2ddda4dd RAC1
UN 172.16.70.45 141.58 KB 256 ? 910ed07e-8484-434d-bc66-3e685e4311c4 RAC1

Thursday, December 4, 2014

UUID concept

http://blog.tompawlak.org/generate-unique-identifier-nodejs-javascript
Version 1 UUIDs are time-based. They also embed a 48-bit identifier (281,474,976,710,655 possible values); in many cases it makes sense to use a machine's MAC address as that identifier, and if several UUID generators run on the same system, explicitly configured identifiers can be used instead. Having a unique identifier per node is very important in a distributed environment, since it is what guarantees conflict-free IDs.
Version 4 UUIDs are generated from truly random or pseudo-random numbers. UUID v4 does not give guaranteed-unique numbers; they are only practically unique. The probability of a duplicate is roughly:
Only after generating 1 billion UUIDs every second for the next 100 years would the probability of creating just one duplicate reach about 50%.
So version 4 UUIDs do have some probability of colliding.

I verified that java.util.UUID in Java 1.7 produces version 4 UUIDs:
        UUID idOne = UUID.randomUUID();
        System.out.println("UUID One: " + idOne + ", version = " + idOne.version());
        => UUID One: 4de87030-2cd6-4547-b2fd-9280d1cb22de, version = 4

In other words, duplicates are possible, however unlikely.
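
The same distinction can be seen from a Linux shell, assuming util-linux's uuidgen is installed:

$ uuidgen -t   # version 1: time-based, embeds a MAC address and timestamp
$ uuidgen -r   # version 4: purely (pseudo-)random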

Wednesday, October 8, 2014

Do not use module.exports and exports.xxx together in the same package

util = require 'util'
Q = require 'q'

randomInteger = (base) ->
  Math.floor Math.random() * base

class People
  constructor: (@name) ->

  do: (telephone, callback) ->
    util.debug "==> #{@name} #{@type}.do, #{telephone}"
    deferred = Q.defer()
    timer = setTimeout =>
      console.log 'do a long study task'
      deferred.resolve telephone;
      util.debug "<== #{@name} #{@type}.do"
    , randomInteger 3000
    deferred.promise

class Student extends People
  constructor: (name) ->
    super name
    @type = 'Student'

class Worker extends People
  constructor: (name) ->
    super name
    @type = 'Worker'

exports.Student = Student

module.exports = Worker # wrong: this overrides the exports.xxx assignment above

====================

With the module.exports = Worker declaration above, loading the package exposes only the Worker function:

Worker = require './school'

and there is no longer any way to get at the Student function.

So it should be rewritten as:

exports.Student = Student
exports.Worker = Worker

Usage:
Student = require('school').Student
Worker = require('school').Worker

student = new Student("baby")
worker = new Worker("Rick")

Thursday, July 31, 2014

Installing StatsD on CentOS 6.5

Prerequisites:
Node.js and CollectD must already be installed.

Next, put the commands below into a single .sh file and run it to install:

###
### Install StatsD ###
###

PARENT_LOCATION="/opt/nodejs"

### Download StatsD ###
cd /usr/local/src
sudo git clone https://github.com/etsy/statsd.git
sudo mv statsd $PARENT_LOCATION/
sudo cp $PARENT_LOCATION/statsd/exampleConfig.js $PARENT_LOCATION/statsd/config.js
### Install node-supervisor ###
#sudo npm install supervisor -g

### Install Supervisord ###
sudo rpm -Uhv https://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
sudo yum -y install supervisor
sudo chkconfig supervisord on
sudo chmod 600 /etc/supervisord.conf

### Configure Supervisord ###
sudo sh -c 'cat statsd.init >> /etc/supervisord.conf'  # sudo does not apply to a plain >> redirection
sudo mkdir -p /var/log/nodejs
sudo rm -f statsd.init
###
### Run StatsD ###
###
sudo service supervisord restart

#############
### Contents of statsd.init ###
[program:nodejs_statsd]
command = /usr/local/bin/node /opt/nodejs/statsd/stats.js /opt/nodejs/statsd/config.js
directory = /opt/nodejs/statsd
user = root
autostart = true
autorestart = true
environment = NODE_ENV="production"
logfile=/var/log/nodejs/statsd.log
logfile_maxbytes=20MB

logfile_backups=10
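
After supervisord restarts, a quick way to confirm StatsD is up (assuming the default UDP listener on port 8125 from exampleConfig.js):

$ sudo supervisorctl status nodejs_statsd
$ echo "test.counter:1|c" | nc -u -w1 127.0.0.1 8125
$ tail /var/log/nodejs/statsd.log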

Installing Node.js on CentOS 6.5

Put the commands below into a single .sh file and run it to install.
When a new Node version is released, just update NODE_VERSION and re-run the script.

#!/bin/bash

NODE_VERSION="v0.10.30"
NODE_FILENAME="node-$NODE_VERSION-linux-x64"
PARENT_LOCATION="/opt/nodejs"

###
### Prerequisites ###
###
yum -y update
yum -y groupinstall "Development Tools"

sudo mkdir -p /usr/local/src
sudo cp etc/statsd.init /usr/local/src

###
### Install NodeJS ###
###

### Download NodeJS ###
cd /usr/local/src
sudo wget -nc http://nodejs.org/dist/$NODE_VERSION/$NODE_FILENAME.tar.gz
#wget -E -H -k -K -p http:///
sudo tar zxvf $NODE_FILENAME.tar.gz
sudo mkdir -p $PARENT_LOCATION
sudo mv $NODE_FILENAME $PARENT_LOCATION/

### Link binary files ###
sudo rm -f /usr/local/bin/node
sudo rm -f /usr/local/bin/npm
sudo ln -s $PARENT_LOCATION/$NODE_FILENAME/bin/node /usr/local/bin
sudo ln -s $PARENT_LOCATION/$NODE_FILENAME/bin/npm /usr/local/bin
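
After the script finishes, a quick sanity check (assuming /usr/local/bin is on the PATH):

$ node --version    # should print v0.10.30
$ npm --version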

Wednesday, July 9, 2014

Installing Clustered Samba on top of GlusterFS

1. Architecture

1.1 Prerequisite and Foundation

  • CentOS 6.x
  • GlusterFS
  • CTDB
  • Samba
Terminology
Abbreviation  Full name                    Description
CIFS          Common Internet File System  Simply put, the Windows "Network Neighborhood" network file sharing system
NFS           Network File System
PV            Physical Volume
VG            Volume Group
LV            Logical Volume
Clustered Samba

1.2 Network Configuration

Prepare two machines, each with three network interfaces, addressed as in the plan below.
Add the following hostnames to /etc/hosts:
# NFS/CIFS access
192.168.18.220  nas1.rickpc gluster01
192.168.18.2  nas2.rickpc gluster02

# CTDB interconnect
192.168.3.101    gluster01c
192.168.3.102    gluster02c

# GlusterFS interconnect
192.168.2.101    gluster01g
192.168.2.102    gluster02g

1.3. Creating the Physical Partitions

For the basics of Linux disk file systems and how to partition disks with fdisk, see the NFS server article [1]; only the essential commands are listed here.
Prepare the physical partitions (this creates /dev/sdb1, /dev/sdb4, and /dev/sdb5):
$ fdisk /dev/sdb
$ partprobe
Partition the disks on nas1 and nas2; the result is shown below.
The disk used here is 8 GB, but only the following partitions are carved out:
/dev/sdb4: 64 MB
/dev/sdb5: 2.1 GB (to be used as physical-volume space)
Disk /dev/sdb: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x9815603c

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               9        1044     8321670    5  Extended
/dev/sdb4               1           8       64228+  83  Linux
/dev/sdb5               9         270     2104483+  83  Linux

1.4. Creating the Linux Volumes

For a deeper look at the concepts behind PVs, VGs, and LVs, see the Logical Volume Manager article [2]. The volume layout is built as follows.
Create phylical volume
$ pvcreate /dev/sdb5
Create volume group
$ vgcreate vg_bricks /dev/sdb5
Create logical volume
$ lvcreate -n lv_lock -L 64M vg_bricks
$ lvcreate -n lv_brick01 -L 1.5G vg_bricks
Install XFS package
$ yum install -y xfsprogs
Format the XFS file systems and mount them
$ mkfs.xfs -i size=512 /dev/vg_bricks/lv_lock
$ mkfs.xfs -i size=512 /dev/vg_bricks/lv_brick01
$ echo '/dev/vg_bricks/lv_lock /bricks/lock xfs defaults 0 0' >> /etc/fstab
$ echo '/dev/vg_bricks/lv_brick01 /bricks/brick01 xfs defaults 0 0' >> /etc/fstab
$ mkdir -p /bricks/lock
$ mkdir -p /bricks/brick01
$ mount /bricks/lock
$ mount /bricks/brick01
Create the PV, VG, and LVs on both nas1 and nas2; the result looks like this:
[root@nas1 ~]# lvdisplay
  --- Logical volume ---
  LV Path                /dev/vg_bricks/lv_lock
  LV Name                lv_lock
  VG Name                vg_bricks
  LV UUID                rnRNbZ-QFun-pxvS-AS3f-pvn3-dvCY-h3qXgi
  LV Write Access        read/write
  LV Creation host, time nas1.rickpc, 2014-07-04 16:54:20 +0800
  LV Status              available
  # open                 1
  LV Size                64.00 MiB
  Current LE             16
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2

  --- Logical volume ---
  LV Path                /dev/vg_bricks/lv_brick01
  LV Name                lv_brick01
  VG Name                vg_bricks
  LV UUID                BwMD2T-YOJi-spM4-aarC-3Yyj-Jfe2-nsecIJ
  LV Write Access        read/write
  LV Creation host, time nas1.rickpc, 2014-07-04 16:56:11 +0800
  LV Status              available
  # open                 1
  LV Size                1.50 GiB
  Current LE             384
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:3

1.5. Installing GlusterFS and Creating Volumes

To understand how CTDB works together with GlusterFS and how to install both, see GlusterFS/CTDB Integration [3] and Clustered NAS For Everyone: Clustering Samba With CTDB [4].
Install GlusterFS packages on all nodes
$ wget -nc http://download.gluster.org/pub/gluster/glusterfs/3.5/LATEST/RHEL/glusterfs-epel.repo -O /etc/yum.repos.d/glusterfs-epel.repo
$ yum install -y rpcbind glusterfs-server
$ chkconfig rpcbind on
$ service rpcbind restart
$ service glusterd restart
Do not auto start glusterd with chkconfig.
Configure cluster and create volumes from gluster01
Add gluster02g to the trusted storage pool:
$ gluster peer probe gluster02g
If you hit "gluster peer probe: failed: Probe returned with unknown errno 107", see [5].
Confirm the peer relationship:
$ gluster peer status
Create the volumes: in the GlusterFS architecture, each volume represents an independent virtual file system.
# transport tcp
$ gluster volume create lockvol replica 2 gluster01g:/bricks/lock gluster02g:/bricks/lock force
$ gluster volume create vol01 replica 2 gluster01g:/bricks/brick01 gluster02g:/bricks/brick01 force
$ gluster vol start lockvol
$ gluster vol start vol01
The GlusterFS virtual file systems are now in place on nas1 and nas2; the result looks like this:
/dev/mapper/vg_bricks-lv_lock
                         60736    3576     57160   6% /bricks/lock
/dev/mapper/vg_bricks-lv_brick01
                       1562624  179536   1383088  12% /bricks/brick01
localhost:/lockvol       60672    3584     57088   6% /gluster/lock
localhost:/vol01       1562624  179584   1383040  12% /gluster/vol01
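At this point the volumes can be double-checked from either node (volume names as created above):
$ gluster volume info
$ gluster volume status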

1.6. Install and configure Samba/CTDB

Install the Samba/CTDB packages [6] on all nodes (samba-3.6.9, samba-client-3.6.9, and ctdb-1.0.114.5)
$ yum install -y samba samba-client ctdb
Install NFS [7] (rpcbind-0.2.0, nfs-utils-1.2.3)
$ yum install -y rpcbind nfs-utils
$ chkconfig rpcbind on
$ service rpcbind start
Configure CTDB and Samba only on gluster01
$ mkdir -p /gluster/lock
$ mount -t glusterfs localhost:/lockvol /gluster/lock
Edit /gluster/lock/ctdb
CTDB_PUBLIC_ADDRESSES=/gluster/lock/public_addresses
CTDB_NODES=/etc/ctdb/nodes
# Only when using Samba. Unnecessary for NFS.
CTDB_MANAGES_SAMBA=yes
# some tunables
CTDB_SET_DeterministicIPs=1
CTDB_SET_RecoveryBanPeriod=120
CTDB_SET_KeepaliveInterval=5
CTDB_SET_KeepaliveLimit=5
CTDB_SET_MonitorInterval=15
Edit /gluster/lock/nodes
192.168.3.101
192.168.3.102
Edit /gluster/lock/public_addresses
192.168.18.201/24 eth0
192.168.18.202/24 eth0
Edit /gluster/lock/smb.conf
[global]
    workgroup = MYGROUP
    server string = Samba Server Version %v
    clustering = yes
    security = user
    passdb backend = tdbsam
[share]
    comment = Shared Directories
    path = /gluster/vol01
    browseable = yes
    writable = yes
Create symlink to config files on all nodes
$ mv /etc/sysconfig/ctdb /etc/sysconfig/ctdb.orig
$ mv /etc/samba/smb.conf /etc/samba/smb.conf.orig
$ ln -s /gluster/lock/ctdb /etc/sysconfig/ctdb
$ ln -s /gluster/lock/nodes /etc/ctdb/nodes
$ ln -s /gluster/lock/public_addresses /etc/ctdb/public_addresses
$ ln -s /gluster/lock/smb.conf /etc/samba/smb.conf
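With the symlinks in place, the Samba configuration can be sanity-checked; testparm ships with the samba package:
$ testparm -s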
Set SELinux permissive for smbd_t on all nodes due to the non-standard smb.conf location
$ yum install -y policycoreutils-python
$ semanage permissive -a smbd_t
We'd better set an appropriate security context, but there's an open issue for using chcon with GlusterFS.
Create the following script for start/stop services in /usr/local/bin/ctdb_manage
#!/bin/sh
function runcmd {
        echo exec on all nodes: $@
        ssh gluster01 $@ &
        ssh gluster02 $@ &
        wait
}
case $1 in
    start)
        runcmd service glusterd start
        sleep 1
        runcmd mkdir -p /gluster/lock
        runcmd mount  -t glusterfs localhost:/lockvol /gluster/lock
        runcmd mkdir -p /gluster/vol01
        runcmd mount  -t glusterfs localhost:/vol01 /gluster/vol01
        runcmd service ctdb start
        ;;

    stop)
        runcmd service ctdb stop
        runcmd umount /gluster/lock
        runcmd umount /gluster/vol01
        runcmd service glusterd stop
        runcmd pkill glusterfs
        ;;
esac
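Make the script executable, then bring the whole cluster up from one node; ctdb status and ctdb ip (used again in section 2.2) show whether the nodes and public addresses are healthy. Expected usage:
$ chmod +x /usr/local/bin/ctdb_manage
$ ctdb_manage start
$ ctdb status
$ ctdb ip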

1.7. Start services

Set the Samba password and check the shared directories via one of the floating IPs.
$ pdbedit -a -u root
Test samba connection
$ smbclient -L 192.168.18.201 -U root
$ smbclient -L 192.168.18.202 -U root
Check Windows connection
$ ssh gluster01 netstat -aT | grep microsoft
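Beyond listing the shares, the [share] section defined above can be browsed through either floating IP; a minimal check:
$ smbclient //192.168.18.201/share -U root -c 'ls'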

2. Testing your clustered Samba

2.1. Client Disconnection

On a Windows PC, map the share as network drive Z: and run the following run_client.bat:
echo off
:LOOP
 echo "%time% (^_-) Writing on file in the shared folder...."
 echo %time% >> z:/wintest.txt
 sleep 2

 echo "%time% (-_^) Writing on file in the shared folder...."
 echo %time% >> z:/wintest.txt
 sleep 2
 goto LOOP
Every two seconds the current timestamp is appended to Z:/wintest.txt. Test procedure:
  1. Run run_client.bat
  2. Disable the network interface on the Windows machine; the program can no longer write to the clustered file system
  3. Re-enable the interface; within a short time the program is writing to the clustered file system again

2.2. CTDB Failover

Use ctdb status and ctdb ip to inspect the current state of the clustered file system. Test procedure:
  1. Run run_client.bat on the Windows PC
  2. On any one cluster node, stop CTDB:
     [root@nas2 ~]# ctdb stop
  3. Observe that the timestamps on the PC are still written to the clustered file system

2.3. Cluster Node Crash

Reboot one cluster node and watch the connection from the Windows PC. Test procedure:
  1. Run run_client.bat on the Windows PC
  2. Shut down the OS on any one cluster node
  3. Watch how the timestamps on the PC change:
    "12:16:49.59 (-_^) Writing on file in the shared folder...."
    "12:16:51.62 (^_-) Writing on file in the shared folder...."
    "12:16:53.66 (-_^) Writing on file in the shared folder...."
    "12:16:55.70 (^_-) Writing on file in the shared folder...."
    "12:16:57.74 (-_^) Writing on file in the shared folder...."
    "12:17:41.90 (^_-) Writing on file in the shared folder...."
    "12:17:43.92 (-_^) Writing on file in the shared folder...."
    "12:17:45.95 (^_-) Writing on file in the shared folder...."
    "12:17:48.00 (-_^) Writing on file in the shared folder...."
"12:16:57.74 (-_^) Writing on file in the shared folder...."
"12:17:41.90 (^_-) Writing on file in the shared folder...."
The two lines repeated above show that the Windows connection is interrupted for a stretch (from 12:16:57 to 12:17:41, about 44 seconds), after which the test program on the PC reconnects by itself, which matches HA-level recovery.

2.4. Ping_pong for CTDB lock rate

ping_pong [8] is a small tool from the Samba project for measuring the CTDB lock rate.
I modified the original source slightly to push the lock rate into Graphite [9], which makes it easy to watch how the lock rate changes over long periods:
ping_pong.socket.c (source code)
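For reference, the stock ping_pong is normally run against a file on the clustered file system with the node count plus one as the second argument (the Graphite-enabled variant above is my own modification):
$ ping_pong /gluster/vol01/test.dat 3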

3. References


  1. Linux 磁碟與檔案系統管理, 鳥哥
  2. 邏輯捲軸管理員 (Logical Volume Manager), 鳥哥
  3. GlusterFS/CTDB Integration, Etsuji Nakai
  4. Clustered NAS For Everyone Clustering Samba With CTDB, Michael Adam
  5. gluster peer probe: failed: Probe returned with unknown errno 107, Network Administrator Blog
  6. SAMBA 伺服器, 鳥哥
  7. NFS 伺服器, 鳥哥
  8. Ping pong, Samba
  9. Graphite - Scalable Realtime Graphing