
[Spark] Common Spark Command Notes

: Tell spark-shell where the Hadoop configuration files live

# Point YARN_CONF_DIR or HADOOP_CONF_DIR at the directory holding the YARN/Hadoop configuration files

set HADOOP_HOME=D:\Big-File\Architecture\hadoop\hadoop-2.3.0
set HADOOP_CONF_DIR=D:\Big-File\Architecture\hadoop\hadoop-2.3.0\etc\hadoop
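On Linux/macOS the equivalent uses `export` instead of `set`; a minimal sketch, assuming a hypothetical install location that you would adjust to your own environment:

```shell
# Hypothetical install path; change to wherever your Hadoop lives.
export HADOOP_HOME=/opt/hadoop-2.3.0
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
echo "$HADOOP_CONF_DIR"
```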

: Load extra libraries when starting spark-shell

bin\spark-shell --jars E:\DM\XXXXXXX-1.0.0.jar

: Set the driver memory

bin\spark-shell --driver-memory 512m --verbose
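Instead of passing the flag on every launch, the same setting can be made persistent in `conf/spark-defaults.conf` (a sketch; the value is an example):

```properties
# Equivalent of --driver-memory 512m
spark.driver.memory   512m
```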

: Spark web UI address

http://192.168.1.5:4040/jobs/

(The UI listens on port 4040 by default; each additional SparkContext on the same host takes the next free port: 4041, 4042, and so on.)

: Run an application with spark-submit

bin\spark-submit --master local[4] --class com.test.mllib.XXXXXX E:\DM\XXXXX-1.0.0.jar 2 3 7 10 1300 1307
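The invocation above breaks down as follows; the trailing numbers are not Spark options but are forwarded verbatim to the main class's argument array. A sketch with the placeholder names from the command (the jar name here is hypothetical):

```shell
MASTER="local[4]"                    # run locally with 4 worker threads
MAIN_CLASS="com.test.mllib.XXXXXX"   # fully qualified main class inside the jar
APP_JAR="app-1.0.0.jar"              # hypothetical application jar
APP_ARGS="2 3 7 10 1300 1307"        # passed straight to the main class

# Print the assembled command rather than running it (no Spark install needed here).
echo "bin/spark-submit --master $MASTER --class $MAIN_CLASS $APP_JAR $APP_ARGS"
```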

:java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------

https://issues.apache.org/jira/browse/SPARK-10528

Fix (applies to Spark 1.6.0 and 1.5.2):

1. Open a Command Prompt in administrator mode
2. Create the directory d:\tmp\hive
3. winutils.exe chmod 777 /tmp/hive
4. Start spark-shell --master local[2]

: Logging configuration in conf/log4j.properties

log4j.rootCategory=INFO, console,FILE

log4j.appender.FILE=org.apache.log4j.DailyRollingFileAppender
log4j.appender.FILE.Threshold=DEBUG
log4j.appender.FILE.file=E:/DM/Spark/spark-1.6.0-bin-hadoop2.6/spark.log
log4j.appender.FILE.DatePattern='.'yyyy-MM-dd
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=[%-5p] [%d{yyyy-MM-dd HH:mm:ss}] [%C{1}:%M:%L] %m%n
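Note that `log4j.rootCategory` above references a `console` appender that the fragment does not define; Spark's bundled `log4j.properties.template` defines it roughly like this:

```properties
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```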

:java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V

This error occurred when accessing Hadoop 2.6 from the spark-1.6-hadoop2.6 build; switching to the spark-1.6-hadoop2.3 build avoided it.

: "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z

Workaround: patch the Hadoop source directly so that this access check returns true (a common Windows-only hack; installing a matching hadoop.dll/winutils.exe build is the cleaner fix).

: Smoke-test HDFS access

val textFile = sc.textFile("hdfs://localhost:19000/README.txt")
textFile.count

: Test submitting a job to YARN

set HADOOP_CONF_DIR=D:\Big-File\Architecture\hadoop\hadoop-2.3.0\etc\hadoop

bin\spark-submit --class com.test.mllib.test.WorkCountApp --master yarn --deploy-mode client --executor-memory 256M --num-executors 1 E:\DM\code\projects\ch11-testit\target\ch11-testit-1.0.0.jar hdfs://localhost:19000/README.txt
bin\spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --executor-memory 128M --num-executors 1 E:\DM\Spark\spark-1.6.0-bin-hadoop2.3\lib\spark-examples-1.6.0-hadoop2.3.0.jar 10

: Cache the Spark assembly jar on HDFS to avoid re-uploading it on every submit

http://blog.csdn.net/amber_amber/article/details/42081045
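The linked post boils down to uploading the assembly jar to HDFS once and pointing the Spark 1.x property `spark.yarn.jar` at it in `conf/spark-defaults.conf` (the HDFS path below is an example matching the local setup in this note):

```properties
# Upload once first:  hadoop fs -put lib\spark-assembly-1.6.0-hadoop2.3.0.jar /spark/
spark.yarn.jar  hdfs://localhost:19000/spark/spark-assembly-1.6.0-hadoop2.3.0.jar
```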

: Exit status: 1. Diagnostics: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: 

http://zy19982004.iteye.com/blog/2031172
