香蕉与打火机

The era of machine learning and AI has arrived

Latest Posts

Problems using Scala's foreach on Java's List type

When calling Scala's foreach on a Java List type, IDEA reports an error.

Error when Scala's foreach is used on a Java data type

It turns out Scala's foreach cannot be used directly on Java collection types. The code was written rather poorly: the Java habits should be dropped entirely and Scala's own data types used when building data structures, so that Scala's language features can be exploited properly.

Workaround:

Add one line:

import scala.collection.JavaConversions._

Still, using Scala's own data types directly is recommended.
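As a quick illustration, here is a minimal sketch (the list contents are made up) of how the implicit conversion makes foreach available once that import is in scope:

import java.util.{ArrayList => JArrayList}
import scala.collection.JavaConversions._  // implicit java.util.List -> Scala Buffer wrapping

object JavaListForeach {
  def main(args: Array[String]): Unit = {
    val javaList = new JArrayList[String]()
    javaList.add("spark")
    javaList.add("scala")

    // Without the import, foreach does not exist on java.util.List;
    // with it, the list is wrapped implicitly and Scala collection methods work.
    javaList.foreach(println)
  }
}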

Reference: http://alvinalexander.com/scala/converting-java-collections-to-scala-list-map-array

Running a Scala program automatically with sbt

sbt run

cd your-project-pwd

sbt 'run-main your-main' > out.txt
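For reference, a minimal sketch of what the main object behind run-main might look like (the name your.pkg.YourMain is hypothetical; its fully qualified name is what gets passed to run-main):

// Hypothetical entry point launched by: sbt 'run-main your.pkg.YourMain'
package your.pkg

object YourMain {
  def main(args: Array[String]): Unit = {
    println("batch job started")
    // actual batch logic goes here
  }
}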

cron table

0 6 * * * /usr/batch/test1

Runs every day at 06:00.

The sbt run command can simply be added to the test1 script.


Example:

#!/bin/sh
#
# ---------------------------------------------------------------------
# Auto parser install script.
# ---------------------------------------------------------------------
#


# Make the project path available to login shells (cron does not read /etc/profile itself).
echo 'export RDB_PARSER_PROJECT_PATH=your-project-path' >> /etc/profile
. /etc/profile

# Register the parser jobs in the system crontab (daily at 01:00, 07:00 and 14:00).
# Double quotes are used so the project path expands when the entries are written.
echo "0 1 * * * root $RDB_PARSER_PROJECT_PATH/auto_parser_ff" >> /etc/crontab
echo "0 7 * * * root $RDB_PARSER_PROJECT_PATH/auto_parser_tf" >> /etc/crontab
echo "0 14 * * * root $RDB_PARSER_PROJECT_PATH/auto_parser_mf" >> /etc/crontab


Upgrading system software with rpm

Reference (Vbird's Linux guide): http://linux.vbird.org/linux_basic/0520rpm_and_srpm.php#rpmmanager_update

Upgrading with RPM is really simple: just use -Uvh or -Fvh, and the options and parameters that -Uvh and -Fvh accept are the same as for install. However, -U and -F do not mean quite the same thing; the basic difference is:

-Uvh: if the package that follows has never been installed, the system installs it directly; if an older version is already installed, the system upgrades it to the new version automatically.
-Fvh: if the package that follows is not yet installed on your Linux system, it will not be installed; in other words, only software already present on your system gets "upgraded".


Removing software: rpm -e

If dependencies prevent removal, do not force-delete; checking with yum is the better option.


Java concurrency: using thread pools with executor.shutdown() and executor.awaitTermination(7, TimeUnit.DAYS)

Reference: http://www.cnblogs.com/dolphin0520/p/3932921.html

A few key points to understand

  1. When the pool runs a class that implements Runnable, its worker thread simply calls that object's run() method directly (an ordinary interface call, not reflection, and no new thread is created per task).
  2. After all tasks have been submitted, the pool can be shut down (no new tasks can be added) and then waited on until every task has finished before exiting, using the two calls below (see the sketch after this list):
         executor.shutdown()
         executor.awaitTermination(7, TimeUnit.DAYS)
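A minimal, self-contained sketch of this shutdown pattern (written in Scala on top of java.util.concurrent; the pool size and task count are arbitrary):

import java.util.concurrent.{Executors, TimeUnit}

object PoolShutdownDemo {
  def main(args: Array[String]): Unit = {
    val executor = Executors.newFixedThreadPool(4)

    // Submit a handful of Runnable tasks; a worker thread calls run() on each one.
    for (i <- 1 to 10) {
      executor.execute(new Runnable {
        override def run(): Unit = println(s"task $i on ${Thread.currentThread().getName}")
      })
    }

    executor.shutdown()                          // stop accepting new tasks
    executor.awaitTermination(7, TimeUnit.DAYS)  // block until all queued tasks finish (or the timeout elapses)
  }
}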

Modifying the ulimit configuration

Edit the configuration file /etc/security/limits.conf:

*    soft    nofile    32768
*    hard    nofile    65536

The asterisk means the limit applies to all users; to target a specific user, replace the asterisk with that user name, for example:

weblogic      soft    nproc   2048
weblogic      hard    nproc   16384
weblogic      soft    nofile  8192
weblogic      hard    nofile  65536

Full description (from the comments in the file):

#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - an user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open files
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#
#<domain>      <type>  <item>         <value>
#


Loading and reading Parquet files with Spark (Scala edition)

Preface

It has been a long time.

I recently needed to read and test Parquet-format files. Hive, Impala, Spark and other frameworks all support Parquet.

This post is a simple hello world using Spark through its Scala interface.

Entering the spark-shell environment

[root@hadoop01 ~]# su - spark
[spark@hadoop01 ~]$ spark-
spark-class   spark-shell   spark-sql     spark-submit  
[spark@hadoop01 ~]$ spark-shell 
15/10/10 10:24:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/10 10:24:12 INFO SecurityManager: Changing view acls to: spark
15/10/10 10:24:12 INFO SecurityManager: Changing modify acls to: spark
15/10/10 10:24:12 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark); users with modify permissions: Set(spark)
15/10/10 10:24:12 INFO HttpServer: Starting HTTP Server
15/10/10 10:24:12 INFO Server: jetty-8.y.z-SNAPSHOT
15/10/10 10:24:12 INFO AbstractConnector: Started SocketConnector@0.0.0.0:48566
15/10/10 10:24:12 INFO Utils: Successfully started service 'HTTP class server' on port 48566.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.3.1
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_71)
Type in expressions to have them evaluated.
Type :help for more information.
15/10/10 10:24:17 INFO SparkContext: Running Spark version 1.3.1
[…A lot of spark log output…]
15/10/10 10:24:28 INFO SparkILoop: Created spark context..
Spark context available as sc.
15/10/10 10:24:29 INFO SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.

scala>

Notes:

  1. The output shows that Spark 1.3.1 is being used here.
  2. spark-shell automatically registers a Spark context for you under the name sc; that sc object is used directly for configuration below.
  3. Once the scala> prompt appears, you are ready to start experimenting.

Creating and configuring the SQLContext

scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@1530f74e

scala> sqlContext.setConf("spark.sql.parquet.binaryAsString","true")

scala>

Steps:

  1. Create the SQLContext from the sc (Spark context) object.
  2. Configure the SQLContext with setConf.
  3. The configurable parameters are listed at: http://spark.apache.org/docs/1.3.1/sql-programming-guide.html#parquet-files
  4. Many more configurable parameters were added in the latest 1.5.1 release.

Loading the Parquet file

Load the file:

scala> val parquetFile = sqlContext.parquetFile("/tmp/me.parquet")
15/10/10 10:47:12 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
parquetFile: org.apache.spark.sql.DataFrame = [CONTRACTID: string, TDATETIME: string, CONTRACTNAME: string, LASTPX: double, HIGHPX: double, LOWPX: double, CQ: double, TQ: double, LASTQTY: double, INITOPENINTS: double, OPENINTS: double, INTSCHG: double, TURNOVER: double, RISELIMIT: double, FALLLIMIT: double, PRESETTLE: double, PRECLOSE: double, OPENPX: double, CLOSEPX: double, SETTLEMENTPX: double, LIFELOW: double, LIFEHIGH: double, AVGPX: double, BIDIMPLYQTY: double, ASKIMPLYQTY: double, SIDE: string, S1: double, B1: double, SV1: double, BV1: double, S5: double, S4: double, S3: double, S2: double, B2: double, B3: double, B4: double, B5: double, SV5: double, SV4: double, SV3: double, SV2: double, BV2: double, BV3: double, BV4: double, BV5: double, PREDELTA: double, CURRDELTA: double, CHG...

Print the Parquet file's schema:

scala> parquetFile.printSchema()
root
 |-- CONTRACTID: string (nullable = false)
 |-- TDATETIME: string (nullable = false)

Note: if sqlContext.setConf("spark.sql.parquet.binaryAsString", "false") were used instead, the column data type would remain the original binary; with the setting made earlier, the conversion to string happens automatically.

Registering the Parquet file as a temporary table

scala> parquetFile.registerTempTable("parquetFile")

DML operations on the Parquet table

Run SQL:

scala> val tdatetime = sqlContext.sql("SELECT TDATETIME FROM parquetFile")
tdatetime: org.apache.spark.sql.DataFrame = [TDATETIME: string]

Iterate over the result:

scala> tdatetime.map(t => "TDATETIME: " + t(0)).collect().foreach(println)
15/10/10 10:17:56 INFO MemoryStore: ensureFreeSpace(223942) called with curMem=276682, maxMem=15558896517
[…A lot of spark log output…]
15/10/10 10:17:56 INFO DAGScheduler: Stage 1 (collect at <console>:26) finished in 0.265 s
15/10/10 10:17:56 INFO DAGScheduler: Job 1 finished: collect at <console>:26, took 0.284977 s
TDATETIME: 2015-05-04 08:45:35.223
TDATETIME: 2015-05-04 08:46:36.067
TDATETIME: 2015-05-04 08:47:36.940
[……]
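The same column can also be pulled out with the DataFrame API instead of SQL; a minimal sketch, assuming the same sqlContext and parquetFile as above (output omitted):

// Equivalent to the SELECT above, using DataFrame operations available in Spark 1.3
val tdatetimeDF = parquetFile.select("TDATETIME")
tdatetimeDF.show(3)  // prints the first few rows as a formatted table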


Generating test data in MySQL with a stored procedure

Declaring the stored procedure

# drop any previous version so the script can be re-run
drop procedure if exists insert_parquet;

# declare the procedure
delimiter @
create procedure insert_parquet(in item integer)
begin
declare counter int;
set counter = item;
while counter >= 1 do
insert into parquet values(counter,concat('company',counter),counter+0.1,CURTIME());
set counter = counter - 1;
end while;
end
@
delimiter ;

Testing it

mysql> truncate table parquet;
Query OK, 0 rows affected

mysql> call insert_parquet(1000000);
Query OK, 1 row affected (50 min 18.30 sec)

That generated 1,000,000 rows; fairly quick, finishing in under an hour.


Displaying a remote desktop on RedHat with VNC

Install vncserver

[root@test~]# yum install vnc-server


Configure VNC

[root@test ~]# vim /etc/sysconfig/vncservers 

 VNCSERVERS="2:root"
 VNCSERVERARGS[2]="-geometry 800x600 -nolisten tcp -localhost"


Start VNC

[root@abmdev01 ~]# vncserver 

You will require a password to access your desktops.

Password:
Verify:
Passwords don't match - try again
Password:
Verify:

New 'abmdev01:1 (root)' desktop is abmdev01:1

Starting applications specified in /root/.vnc/xstartup
Log file is /root/.vnc/abmdev01:1.log

Connect with a client

[Screenshot: connecting with a VNC client]


Close VNC sessions

[root@test~]# vncserver -list

TigerVNC server sessions:

X DISPLAY #     PROCESS ID
:1              6069
:2              6487
[root@test~]# vncserver -kill :1
Killing Xvnc process ID 6069
[root@test~]# vncserver -kill :2
Killing Xvnc process ID 6487

Stop the VNC service

[root@test~]# service vncserver stop