
Spark: running apps on YARN

 

  Running an app on a YARN cluster is straightforward:

  1. Set up HADOOP_CONF_DIR.

     You can run export HADOOP_CONF_DIR=xx in your shell,

     or add the same line to spark-env.sh so it persists; a concrete sketch follows.
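
     A minimal sketch (the exact path is an assumption for illustration; on the cluster in this post Hadoop appears to live under /usr/local/hadoop/hadoop-2.5.2, judging from the ps output further down, so its config dir would be the etc/hadoop subdirectory):

export HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-2.5.2/etc/hadoop

     Putting that same line into conf/spark-env.sh makes it stick for every spark-submit from that installation.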

  2. Submit the application with spark-submit:

spark-submit  --master yarn --class org.apache.spark.examples.JavaWordCount --verbose --deploy-mode client ~/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0-my.jar RELEASE 2

  You can go to the YARN ResourceManager web UI to check the application's info.

  Also, if you want to specify the number of executors (each executor runs in its own YARN container, plus one container for the ApplicationMaster), just add this option to the command above; a fuller sketch follows below:

 --num-executors 2
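
  For a fuller sketch, here is the same submit with the executor sizing spelled out explicitly (the 2g / 2-core / 2-executor values simply mirror what shows up in the ps output below; they are not recommendations):

spark-submit --master yarn --deploy-mode client \
  --num-executors 2 --executor-memory 2g --executor-cores 2 \
  --class org.apache.spark.examples.JavaWordCount --verbose \
  ~/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0-my.jar RELEASE 2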

 

-- ApplicationMaster processes (ps output on the node running the AM; the first line is the bash wrapper launched by the NodeManager, the second is the JVM running org.apache.spark.deploy.yarn.ExecutorLauncher):

hadoop    2758 13206  0 16:52 ?        00:00:00 /bin/bash -c /usr/local/jdk/jdk1.6.0_31/bin/java -server -Xmx512m -Djava.io.tmpdir=/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/hadoop/appcache/application_1441038159113_0029/container_1441038159113_0029_01_000001/tmp '-Dspark.eventLog.enabled=true' '-Dspark.externalBlockStore.folderName=spark-a5761a0d-2f87-4afc-b4eb-dbaf1fd86ef4' '-Dspark.executor.memory=2g' '-Dspark.jars=file:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0.jar' '-Dspark.master.ui.port=7102' '-Dspark.ui.port=7108' '-Dspark.worker.ui.port=7105' '-Dspark.driver.appUIAddress=http://192.168.100.4:7108' '-Dspark.master=yarn-client' '-Dspark.driver.allowMultipleContexts=true' '-Dspark.driver.port=52394' '-Dspark.eventLog.dir=hdfs://hd02:8020/user/hadoop/spark-eventlog' '-Dspark.executor.id=driver' '-Dspark.executor.extraJavaOptions=-Xloggc:~/spark-executor.gc -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=2 -XX:CMSInitiatingOccupancyFraction=65 -XX:+UseCMSInitiatingOccupancyOnly -XX:PermSize=64m -XX:MaxPermSize=256m -XX:NewRatio=5 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:ParallelGCThreads=5' '-Dspark.executor.cores=2' '-Dspark.driver.host=192.168.100.4' '-Dspark.driver.memory=6g' '-Dspark.storage.memoryFraction=0.5' '-Dspark.app.name=JavaWordCount' '-Dspark.fileserver.uri=http://192.168.100.4:48227' '-Dspark.cores.max=50' -Dspark.yarn.app.container.log.dir=/usr/local/hadoop/hadoop-2.5.2/logs/userlogs/application_1441038159113_0029/container_1441038159113_0029_01_000001 org.apache.spark.deploy.yarn.ExecutorLauncher --arg '192.168.100.4:52394' --executor-memory 2048m --executor-cores 2 --num-executors  2 1> /usr/local/hadoop/hadoop-2.5.2/logs/userlogs/application_1441038159113_0029/container_1441038159113_0029_01_000001/stdout 2> /usr/local/hadoop/hadoop-2.5.2/logs/userlogs/application_1441038159113_0029/container_1441038159113_0029_01_000001/stderr

hadoop    2763  2758 23 16:52 ?        00:00:06 /usr/local/jdk/jdk1.6.0_31/bin/java -server -Xmx512m -Djava.io.tmpdir=/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/hadoop/appcache/application_1441038159113_0029/container_1441038159113_0029_01_000001/tmp -Dspark.eventLog.enabled=true -Dspark.externalBlockStore.folderName=spark-a5761a0d-2f87-4afc-b4eb-dbaf1fd86ef4 -Dspark.executor.memory=2g -Dspark.jars=file:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0.jar -Dspark.master.ui.port=7102 -Dspark.ui.port=7108 -Dspark.worker.ui.port=7105 -Dspark.driver.appUIAddress=http://192.168.100.4:7108 -Dspark.master=yarn-client -Dspark.driver.allowMultipleContexts=true -Dspark.driver.port=52394 -Dspark.eventLog.dir=hdfs://hd02:8020/user/hadoop/spark-eventlog -Dspark.executor.id=driver -Dspark.executor.extraJavaOptions=-Xloggc:~/spark-executor.gc -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=2 -XX:CMSInitiatingOccupancyFraction=65 -XX:+UseCMSInitiatingOccupancyOnly -XX:PermSize=64m -XX:MaxPermSize=256m -XX:NewRatio=5 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:ParallelGCThreads=5 -Dspark.executor.cores=2 -Dspark.driver.host=192.168.100.4 -Dspark.driver.memory=6g -Dspark.storage.memoryFraction=0.5 -Dspark.app.name=JavaWordCount -Dspark.fileserver.uri=http://192.168.100.4:48227 -Dspark.cores.max=50 -Dspark.yarn.app.container.log.dir=/usr/local/hadoop/hadoop-2.5.2/logs/userlogs/application_1441038159113_0029/container_1441038159113_0029_01_000001 org.apache.spark.deploy.yarn.ExecutorLauncher --arg 192.168.100.4:52394 --executor-memory 2048m --executor-cores 2 --num-executors 2
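
  (Listings like the two above can be captured on the relevant node with plain ps; the grep pattern below is just an illustration keyed to this app's ID.)

ps -ef | grep application_1441038159113_0029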

  -- Executor (task) container processes (ps output on a worker node; same pattern: a bash wrapper plus the JVM running org.apache.spark.executor.CoarseGrainedExecutorBackend):

hadoop   10382  1055  0 17:20 ?        00:00:00 /bin/bash -c /usr/local/jdk/jdk1.6.0_31/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms2048m -Xmx2048m '-Xloggc:~/spark-executor.gc' '-XX:+UseCMSCompactAtFullCollection' '-XX:CMSFullGCsBeforeCompaction=2' '-XX:CMSInitiatingOccupancyFraction=65' '-XX:+UseCMSInitiatingOccupancyOnly' '-XX:PermSize=64m' '-XX:MaxPermSize=256m' '-XX:NewRatio=5' '-XX:+UseParNewGC' '-XX:+UseConcMarkSweepGC' '-XX:+PrintGCDateStamps' '-XX:+PrintGCDetails' '-XX:ParallelGCThreads=5' -Djava.io.tmpdir=/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/hadoop/appcache/application_1441038159113_0031/container_1441038159113_0031_01_000003/tmp '-Dspark.master.ui.port=7102' '-Dspark.ui.port=7108' '-Dspark.worker.ui.port=7105' '-Dspark.driver.port=44382' -Dspark.yarn.app.container.log.dir=/usr/local/hadoop/hadoop-2.5.2/logs/userlogs/application_1441038159113_0031/container_1441038159113_0031_01_000003 org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url akka.tcp://sparkDriver@192.168.100.4:44382/user/CoarseGrainedScheduler --executor-id 2 --hostname gzsw-13 --cores 2 --app-id application_1441038159113_0031 --user-class-path file:/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/hadoop/appcache/application_1441038159113_0031/container_1441038159113_0031_01_000003/__app__.jar 1> /usr/local/hadoop/hadoop-2.5.2/logs/userlogs/application_1441038159113_0031/container_1441038159113_0031_01_000003/stdout 2> /usr/local/hadoop/hadoop-2.5.2/logs/userlogs/application_1441038159113_0031/container_1441038159113_0031_01_000003/stderr
hadoop   10386 10382 99 17:20 ?        00:00:25 /usr/local/jdk/jdk1.6.0_31/bin/java -server -XX:OnOutOfMemoryError=kill %p -Xms2048m -Xmx2048m -Xloggc:~/spark-executor.gc -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=2 -XX:CMSInitiatingOccupancyFraction=65 -XX:+UseCMSInitiatingOccupancyOnly -XX:PermSize=64m -XX:MaxPermSize=256m -XX:NewRatio=5 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:ParallelGCThreads=5 -Djava.io.tmpdir=/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/hadoop/appcache/application_1441038159113_0031/container_1441038159113_0031_01_000003/tmp -Dspark.master.ui.port=7102 -Dspark.ui.port=7108 -Dspark.worker.ui.port=7105 -Dspark.driver.port=44382 -Dspark.yarn.app.container.log.dir=/usr/local/hadoop/hadoop-2.5.2/logs/userlogs/application_1441038159113_0031/container_1441038159113_0031_01_000003 org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url akka.tcp://sparkDriver@192.168.100.4:44382/user/CoarseGrainedScheduler --executor-id 2 --hostname gzsw-13 --cores 2 --app-id application_1441038159113_0031 --user-class-path file:/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/hadoop/appcache/application_1441038159113_0031/container_1441038159113_0031_01_000003/__app__.jar
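
  The 1> ... 2> ... redirects at the end of each bash wrapper show where a container's stdout/stderr land on that node's local disk. If log aggregation is enabled on the cluster (yarn.log-aggregation-enable=true), you can also pull every container's logs for a finished app in one command, e.g.:

yarn logs -applicationId application_1441038159113_0031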

 

  Corresponding figures: (screenshots not reproduced here)

  The logs from the driver (note that both tasks of the first stage run on gzsw-05, while in the second stage one task runs on each of gzsw-05 and gzsw-06):

hadoop@GZsw04:~/spark/spark-1.4.1-bin-hadoop2.4$ spark-submit  --master yarn --class org.apache.spark.examples.JavaWordCount --verbose --deploy-mode client ~/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0-my.jar RELEASE 2
Using properties file: /home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/conf/spark-defaults.conf
Adding default property: spark.executor.extraJavaOptions=-Xloggc:/home/hadoop/spark-executor.gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.ui.port=7106
Adding default property: spark.deploy.spreadOut=false
Adding default property: spark.worker.ui.port=7105
Adding default property: spark.master.ui.port=7102
Adding default property: spark.eventLog.dir=/home/hadoop/spark/spark-eventlog
Adding default property: spark.driver.allowMultipleContexts=true
Parsed arguments:
  master                  yarn
  deployMode              client
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          /home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/conf/spark-defaults.conf
  driverMemory            1g
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.spark.examples.JavaWordCount
  primaryResource         file:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0-my.jar
  name                    org.apache.spark.examples.JavaWordCount
  childArgs               [RELEASE 2]
  jars                    null
  packages                null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file /home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/conf/spark-defaults.conf:
  spark.eventLog.enabled -> true
  spark.driver.allowMultipleContexts -> true
  spark.ui.port -> 7106
  spark.executor.extraJavaOptions -> -Xloggc:/home/hadoop/spark-executor.gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails
  spark.deploy.spreadOut -> false
  spark.eventLog.dir -> /home/hadoop/spark/spark-eventlog
  spark.worker.ui.port -> 7105
  spark.master.ui.port -> 7102

    
Main class:
org.apache.spark.examples.JavaWordCount
Arguments:
RELEASE
2
System properties:
spark.driver.memory -> 1g
spark.eventLog.enabled -> true
spark.driver.allowMultipleContexts -> true
SPARK_SUBMIT -> true
spark.ui.port -> 7106
spark.executor.extraJavaOptions -> -Xloggc:/home/hadoop/spark-executor.gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails
spark.deploy.spreadOut -> false
spark.app.name -> org.apache.spark.examples.JavaWordCount
spark.jars -> file:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0-my.jar
spark.eventLog.dir -> /home/hadoop/spark/spark-eventlog
spark.master -> yarn-client
spark.worker.ui.port -> 7105
spark.master.ui.port -> 7102
Classpath elements:
file:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0-my.jar


15/11/25 16:46:55 INFO spark.SparkContext: Running Spark version 1.4.1
15/11/25 16:46:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/25 16:46:55 INFO spark.SecurityManager: Changing view acls to: hadoop
15/11/25 16:46:55 INFO spark.SecurityManager: Changing modify acls to: hadoop
15/11/25 16:46:55 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
15/11/25 16:46:56 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/11/25 16:46:56 INFO Remoting: Starting remoting
15/11/25 16:46:56 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.100.4:45880]
15/11/25 16:46:56 INFO util.Utils: Successfully started service 'sparkDriver' on port 45880.
15/11/25 16:46:56 INFO spark.SparkEnv: Registering MapOutputTracker
15/11/25 16:46:56 INFO spark.SparkEnv: Registering BlockManagerMaster
15/11/25 16:46:56 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-52cdfa49-3560-40bd-9540-107f059b5d95/blockmgr-a25e4dc5-b8e0-4877-ad63-b0e32880e187
15/11/25 16:46:56 INFO storage.MemoryStore: MemoryStore started with capacity 529.9 MB
15/11/25 16:46:57 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-52cdfa49-3560-40bd-9540-107f059b5d95/httpd-8b586e36-69a3-46c1-880d-5f294a643833
15/11/25 16:46:57 INFO spark.HttpServer: Starting HTTP Server
15/11/25 16:46:57 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/11/25 16:46:57 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:51033
15/11/25 16:46:57 INFO util.Utils: Successfully started service 'HTTP file server' on port 51033.
15/11/25 16:46:57 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/11/25 16:46:57 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/11/25 16:46:57 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:7106
15/11/25 16:46:57 INFO util.Utils: Successfully started service 'SparkUI' on port 7106.
15/11/25 16:46:57 INFO ui.SparkUI: Started SparkUI at http://192.168.100.4:7106
15/11/25 16:46:57 INFO spark.SparkContext: Added JAR file:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0-my.jar at http://192.168.100.4:51033/jars/spark-examples-1.4.1-hadoop2.4.0-my.jar with timestamp 1448441217526
15/11/25 16:46:57 WARN cluster.YarnClientSchedulerBackend: NOTE: SPARK_WORKER_MEMORY is deprecated. Use SPARK_EXECUTOR_MEMORY or --executor-memory through spark-submit instead.
15/11/25 16:46:57 WARN cluster.YarnClientSchedulerBackend: NOTE: SPARK_WORKER_CORES is deprecated. Use SPARK_EXECUTOR_CORES or --executor-cores through spark-submit instead.
15/11/25 16:46:57 INFO client.RMProxy: Connecting to ResourceManager at hd02/192.168.100.4:8032
15/11/25 16:46:57 INFO yarn.Client: Requesting a new application from cluster with 10 NodeManagers
15/11/25 16:46:57 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/11/25 16:46:57 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/11/25 16:46:57 INFO yarn.Client: Setting up container launch context for our AM
15/11/25 16:46:57 INFO yarn.Client: Preparing resources for our AM container
15/11/25 16:46:58 INFO yarn.Client: Uploading resource file:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-assembly-1.4.1-hadoop2.4.0.jar -> hdfs://mycluster/user/hadoop/.sparkStaging/application_1441038159113_0003/spark-assembly-1.4.1-hadoop2.4.0.jar
15/11/25 16:47:00 INFO yarn.Client: Uploading resource file:/tmp/spark-52cdfa49-3560-40bd-9540-107f059b5d95/__hadoop_conf__6446760494119929942.zip -> hdfs://mycluster/user/hadoop/.sparkStaging/application_1441038159113_0003/__hadoop_conf__6446760494119929942.zip
15/11/25 16:47:00 INFO yarn.Client: Setting up the launch environment for our AM container
15/11/25 16:47:00 INFO spark.SecurityManager: Changing view acls to: hadoop
15/11/25 16:47:00 INFO spark.SecurityManager: Changing modify acls to: hadoop
15/11/25 16:47:00 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
15/11/25 16:47:00 INFO yarn.Client: Submitting application 3 to ResourceManager
15/11/25 16:47:00 INFO impl.YarnClientImpl: Submitted application application_1441038159113_0003
15/11/25 16:47:01 INFO yarn.Client: Application report for application_1441038159113_0003 (state: ACCEPTED)
15/11/25 16:47:01 INFO yarn.Client: 
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1448441220409
	 final status: UNDEFINED
	 tracking URL: http://hd02:7104/proxy/application_1441038159113_0003/
	 user: hadoop
15/11/25 16:47:02 INFO yarn.Client: Application report for application_1441038159113_0003 (state: ACCEPTED)
15/11/25 16:47:03 INFO yarn.Client: Application report for application_1441038159113_0003 (state: ACCEPTED)
15/11/25 16:47:04 INFO yarn.Client: Application report for application_1441038159113_0003 (state: ACCEPTED)
15/11/25 16:47:05 INFO yarn.Client: Application report for application_1441038159113_0003 (state: ACCEPTED)
15/11/25 16:47:06 INFO yarn.Client: Application report for application_1441038159113_0003 (state: ACCEPTED)
15/11/25 16:47:06 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://sparkYarnAM@192.168.100.14:46652/user/YarnAM#-1250321572])
15/11/25 16:47:06 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> hd02, PROXY_URI_BASES -> http://hd02:7104/proxy/application_1441038159113_0003), /proxy/application_1441038159113_0003
15/11/25 16:47:06 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/11/25 16:47:07 INFO yarn.Client: Application report for application_1441038159113_0003 (state: RUNNING)
15/11/25 16:47:07 INFO yarn.Client: 
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: 192.168.100.14
	 ApplicationMaster RPC port: 0
	 queue: default
	 start time: 1448441220409
	 final status: UNDEFINED
	 tracking URL: http://hd02:7104/proxy/application_1441038159113_0003/
	 user: hadoop
15/11/25 16:47:07 INFO cluster.YarnClientSchedulerBackend: Application application_1441038159113_0003 has started running.
15/11/25 16:47:07 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52047.
15/11/25 16:47:07 INFO netty.NettyBlockTransferService: Server created on 52047
15/11/25 16:47:07 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/11/25 16:47:07 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.100.4:52047 with 529.9 MB RAM, BlockManagerId(driver, 192.168.100.4, 52047)
15/11/25 16:47:07 INFO storage.BlockManagerMaster: Registered BlockManager
15/11/25 16:47:07 INFO scheduler.EventLoggingListener: Logging events to file:/home/hadoop/spark/spark-eventlog/application_1441038159113_0003
15/11/25 16:47:17 INFO cluster.YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@gzsw-05:55796/user/Executor#-2059071929]) with ID 1
15/11/25 16:47:17 INFO storage.BlockManagerMasterEndpoint: Registering block manager gzsw-05:52897 with 2.1 GB RAM, BlockManagerId(1, gzsw-05, 52897)
15/11/25 16:47:17 INFO cluster.YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@gzsw-06:56733/user/Executor#261866940]) with ID 2
15/11/25 16:47:17 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
15/11/25 16:47:17 INFO storage.BlockManagerMasterEndpoint: Registering block manager gzsw-06:38994 with 2.1 GB RAM, BlockManagerId(2, gzsw-06, 38994)
15/11/25 16:47:17 INFO storage.MemoryStore: ensureFreeSpace(228640) called with curMem=0, maxMem=555684986
15/11/25 16:47:17 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 223.3 KB, free 529.7 MB)
15/11/25 16:47:17 INFO storage.MemoryStore: ensureFreeSpace(18166) called with curMem=228640, maxMem=555684986
15/11/25 16:47:17 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 17.7 KB, free 529.7 MB)
15/11/25 16:47:17 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.100.4:52047 (size: 17.7 KB, free: 529.9 MB)
15/11/25 16:47:17 INFO spark.SparkContext: Created broadcast 0 from textFile at JavaWordCount.java:49
15/11/25 16:47:17 INFO mapred.FileInputFormat: Total input paths to process : 1
15/11/25 16:47:17 INFO spark.SparkContext: Starting job: collect at JavaWordCount.java:72
15/11/25 16:47:17 INFO scheduler.DAGScheduler: Registering RDD 3 (mapToPair at JavaWordCount.java:58)
15/11/25 16:47:17 INFO scheduler.DAGScheduler: Got job 0 (collect at JavaWordCount.java:72) with 2 output partitions (allowLocal=false)
15/11/25 16:47:17 INFO scheduler.DAGScheduler: Final stage: ResultStage 1(collect at JavaWordCount.java:72)
15/11/25 16:47:17 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
15/11/25 16:47:17 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 0)
15/11/25 16:47:17 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[3] at mapToPair at JavaWordCount.java:58), which has no missing parents
15/11/25 16:47:18 INFO storage.MemoryStore: ensureFreeSpace(4736) called with curMem=246806, maxMem=555684986
15/11/25 16:47:18 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.6 KB, free 529.7 MB)
15/11/25 16:47:18 INFO storage.MemoryStore: ensureFreeSpace(2644) called with curMem=251542, maxMem=555684986
15/11/25 16:47:18 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.6 KB, free 529.7 MB)
15/11/25 16:47:18 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.100.4:52047 (size: 2.6 KB, free: 529.9 MB)
15/11/25 16:47:18 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:874
15/11/25 16:47:18 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[3] at mapToPair at JavaWordCount.java:58)
15/11/25 16:47:18 INFO cluster.YarnScheduler: Adding task set 0.0 with 2 tasks
15/11/25 16:47:18 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, gzsw-05, NODE_LOCAL, 1479 bytes)
15/11/25 16:47:18 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, gzsw-05, NODE_LOCAL, 1479 bytes)
15/11/25 16:47:19 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on gzsw-05:52897 (size: 2.6 KB, free: 2.1 GB)
15/11/25 16:47:19 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on gzsw-05:52897 (size: 17.7 KB, free: 2.1 GB)
15/11/25 16:47:20 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 2705 ms on gzsw-05 (1/2)
15/11/25 16:47:20 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 2725 ms on gzsw-05 (2/2)
15/11/25 16:47:20 INFO cluster.YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 
15/11/25 16:47:20 INFO scheduler.DAGScheduler: ShuffleMapStage 0 (mapToPair at JavaWordCount.java:58) finished in 2.733 s
15/11/25 16:47:20 INFO scheduler.DAGScheduler: looking for newly runnable stages
15/11/25 16:47:20 INFO scheduler.DAGScheduler: running: Set()
15/11/25 16:47:20 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 1)
15/11/25 16:47:20 INFO scheduler.DAGScheduler: failed: Set()
15/11/25 16:47:20 INFO scheduler.DAGScheduler: Missing parents for ResultStage 1: List()
15/11/25 16:47:20 INFO scheduler.DAGScheduler: Submitting ResultStage 1 (ShuffledRDD[4] at reduceByKey at JavaWordCount.java:65), which is now runnable
15/11/25 16:47:20 INFO storage.MemoryStore: ensureFreeSpace(2408) called with curMem=254186, maxMem=555684986
15/11/25 16:47:20 INFO storage.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.4 KB, free 529.7 MB)
15/11/25 16:47:20 INFO storage.MemoryStore: ensureFreeSpace(1459) called with curMem=256594, maxMem=555684986
15/11/25 16:47:20 INFO storage.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1459.0 B, free 529.7 MB)
15/11/25 16:47:20 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.100.4:52047 (size: 1459.0 B, free: 529.9 MB)
15/11/25 16:47:20 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:874
15/11/25 16:47:20 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (ShuffledRDD[4] at reduceByKey at JavaWordCount.java:65)
15/11/25 16:47:20 INFO cluster.YarnScheduler: Adding task set 1.0 with 2 tasks
15/11/25 16:47:20 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, gzsw-06, PROCESS_LOCAL, 1246 bytes)
15/11/25 16:47:20 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, gzsw-05, PROCESS_LOCAL, 1246 bytes)
15/11/25 16:47:20 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on gzsw-05:52897 (size: 1459.0 B, free: 2.1 GB)
15/11/25 16:47:20 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to gzsw-05:55796
15/11/25 16:47:20 INFO spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 147 bytes
15/11/25 16:47:20 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 98 ms on gzsw-05 (1/2)
15/11/25 16:47:22 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on gzsw-06:38994 (size: 1459.0 B, free: 2.1 GB)
15/11/25 16:47:22 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to gzsw-06:56733
15/11/25 16:47:22 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 1748 ms on gzsw-06 (2/2)
15/11/25 16:47:22 INFO scheduler.DAGScheduler: ResultStage 1 (collect at JavaWordCount.java:72) finished in 1.749 s
15/11/25 16:47:22 INFO cluster.YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 
15/11/25 16:47:22 INFO scheduler.DAGScheduler: Job 0 finished: collect at JavaWordCount.java:72, took 4.603967 s
total items 14
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
15/11/25 16:47:22 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
15/11/25 16:47:22 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.100.4:7106
15/11/25 16:47:22 INFO scheduler.DAGScheduler: Stopping DAGScheduler
15/11/25 16:47:22 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
15/11/25 16:47:22 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
15/11/25 16:47:22 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
15/11/25 16:47:22 INFO cluster.YarnClientSchedulerBackend: Stopped
15/11/25 16:47:22 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
15/11/25 16:47:22 INFO util.Utils: path = /tmp/spark-52cdfa49-3560-40bd-9540-107f059b5d95/blockmgr-a25e4dc5-b8e0-4877-ad63-b0e32880e187, already present as root for deletion.
15/11/25 16:47:22 INFO storage.MemoryStore: MemoryStore cleared
15/11/25 16:47:22 INFO storage.BlockManager: BlockManager stopped
15/11/25 16:47:22 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
15/11/25 16:47:22 INFO spark.SparkContext: Successfully stopped SparkContext
15/11/25 16:47:22 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
15/11/25 16:47:22 INFO util.Utils: Shutdown hook called
15/11/25 16:47:22 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/11/25 16:47:22 INFO util.Utils: Deleting directory /tmp/spark-52cdfa49-3560-40bd-9540-107f059b5d95
15/11/25 16:47:22 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
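
  Since spark.eventLog.enabled is true and spark.eventLog.dir is set (see the properties above), this run can also be replayed in the Spark history server afterwards. A minimal sketch, assuming you point the history server at the same event-log directory the app wrote to:

# in conf/spark-env.sh of the Spark installation
export SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=file:/home/hadoop/spark/spark-eventlog"

# then, from the Spark home:
sbin/start-history-server.sh

  The history UI listens on port 18080 by default.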

 
