Article list
1. flow: 1.1 shuffle abstract, 1.2 shuffle flow, 1.3 sort flow in shuffle, 1.4 data structures in memory; 2. core code paths //SortShuffleWriter override def write(records: Iterator[Product2[K, V]]): Unit = { //-how to collect this result by partition? by the index file //-1 sort ...
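The entry's core idea can be sketched independently of Spark's own classes: sort the records by partition id, append them all to one data file, and record each partition's starting byte offset in an index file. A minimal sketch, assuming a length-prefixed record layout and hypothetical names throughout (this is not the real SortShuffleWriter):

    import java.io.{DataOutputStream, FileOutputStream}

    object SortShuffleSketch {
      // records: (partitionId, payload) pairs produced by one map task
      def writePartitioned(records: Seq[(Int, String)],
                           numPartitions: Int,
                           dataPath: String,
                           indexPath: String): Unit = {
        val sorted = records.sortBy(_._1)                 // 1. sort by partition id
        val offsets = new Array[Long](numPartitions + 1)  // offsets(p) = where partition p starts
        val data = new DataOutputStream(new FileOutputStream(dataPath))
        var written = 0L
        var p = 0
        for ((pid, rec) <- sorted) {
          while (p <= pid) { offsets(p) = written; p += 1 }  // mark starts of partitions up to pid
          val bytes = rec.getBytes("UTF-8")
          data.writeInt(bytes.length)                     // length-prefixed record
          data.write(bytes)
          written += 4 + bytes.length
        }
        while (p <= numPartitions) { offsets(p) = written; p += 1 }  // empty tail partitions
        data.close()
        val index = new DataOutputStream(new FileOutputStream(indexPath))
        offsets.foreach(index.writeLong)                  // 2. one long per partition boundary
        index.close()
      }
    }

A reader that wants partition i then only needs the bytes in [offsets(i), offsets(i+1)) of the data file, which is why a single data file plus an index file suffices.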
  In this section, we will examine how Spark collects data from the previous stage into the next stage (the result task). Figure: after finishing the ShuffleMapTask computation (i.e. post-processing). Note: the last method 'reviveOffers()' is redundant in this mode, as step 13 will set up the next stage (ResultTask ...
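On the collecting side, a task of the next stage can locate its partition from the index file alone; a continuation of the sketch above, again assuming the same hypothetical file layout rather than Spark's actual shuffle-fetch path:

    import java.io.{DataInputStream, FileInputStream}
    import scala.collection.mutable.ArrayBuffer

    object ShuffleReadSketch {
      // Returns the records of one partition by slicing the map output file.
      def readPartition(dataPath: String, indexPath: String, partitionId: Int): Seq[String] = {
        val index = new DataInputStream(new FileInputStream(indexPath))
        index.skipBytes(partitionId * 8)      // each boundary is one 8-byte long
        val start = index.readLong()          // where this partition begins
        val end   = index.readLong()          // where the next one begins
        index.close()

        val data = new DataInputStream(new FileInputStream(dataPath))
        data.skipBytes(start.toInt)           // a real reader would seek; fine for a sketch
        val out = ArrayBuffer[String]()
        var pos = start
        while (pos < end) {
          val len = data.readInt()            // length prefix written by the sketch above
          val buf = new Array[Byte](len)
          data.readFully(buf)
          out += new String(buf, "UTF-8")
          pos += 4 + len
        }
        data.close()
        out.toSeq
      }
    }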
  Now we will dive into Spark internals via this simple example (WordCount; later articles will reference it by default): sparkConf.setMaster("local[2]") //-local[*] by default //leib-confs: output all the dependency logs sparkConf.set("spark.logLineage","tru ...
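For reference, here is the WordCount driver the snippet belongs to, reconstructed as a runnable sketch; the input path is a placeholder, and the two conf lines are taken from the excerpt:

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCount {
      def main(args: Array[String]): Unit = {
        val sparkConf = new SparkConf().setAppName("WordCount")
        sparkConf.setMaster("local[2]")            // local[*] by default
        sparkConf.set("spark.logLineage", "true")  // log each job's RDD lineage (toDebugString)
        val sc = new SparkContext(sparkConf)

        sc.textFile("data/input.txt")              // placeholder input path
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)
          .collect()
          .foreach(println)

        sc.stop()
      }
    }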
  Similar to other open source projects, Spark ships several shell scripts, listed here. sbin (server-side scripts): start-all.sh starts all the Spark daemons (i.e. start-master.sh, start-slaves.sh); start-master.sh starts up Spark's master process, delegating to "spark-daemon.sh ...
If you want to compare two objects for equality, you can use ==, or its negation != (this applies to all objects, not just primitive types).
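A quick illustration in Scala, where == delegates to equals and is null-safe, while eq compares references:

    val a = new String("spark")
    val b = new String("spark")

    println(a == b)    // true:  value equality via equals, works for any object
    println(a != b)    // false: the negation of ==
    println(a eq b)    // false: reference equality, two distinct instances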
With both the environment variable 'SPARK_PRINT_LAUNCH_COMMAND' and the --verbose flag enabled, spark-submit.sh prints the launch command in much more detail:   hadoop@GZsw04:~/spark/spark-1.4.1-bin-hadoop2.4$ spark-submit --master yarn --verbose --class org.apache.spark.examples.JavaWordCount lib/spark ...
ref: converting a Scala object to a Class; forced type casting in Scala
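Judging from those reference titles alone, the two techniques are getting a Class from a Scala object and forced type casting; a small sketch:

    val xs: Any = List(1, 2, 3)

    val cls: Class[_] = xs.getClass             // the runtime Class of a Scala object
    println(cls.getName)                        // scala.collection.immutable.$colon$colon

    if (xs.isInstanceOf[List[_]]) {             // check before a forced cast
      val ints = xs.asInstanceOf[List[Int]]     // forced cast: ClassCastException if wrong
      println(ints.sum)                         // 6
    }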

[spark-src] 1-overview

What is it? "Apache Spark™ is a fast and general engine for large-scale data processing. ... Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.", as stated on the Apache Spark site. Whether or not that claim always holds, I think certain key concepts/components to ...
  Based on: spark-1.4.1, hadoop-2.5.2. Going from simple to complex and following the working flow, we proceed in these steps: 1. [spark-src] spark overview; 2. [spark-src] core: from basic demos to diving into Spark internals. This section will involve many components, so it's much more detail ...
Google AlphaGo vs Lee on 'the game of Go' VS Back in Guangzhou, back in the arena. Cheers
env: hbase 0.94.26, zookeeper 3.4.3 --------------- 1. downed node: this morning we found a regionserver (host-34) down in our monitoring, so we dived into the HBase logs and found on this host:   2016-02-29 00:50:36,799 INFO [regionserver60020-SendThread(host-04:2181)] ClientCnxn.java:108 ...
  Spark Streaming lineage. ref: "Spark Streaming: 大规模流式数据处理的新贵" (Spark Streaming: the rising star of large-scale stream processing)
HBase QQ study and exchange group 476390228, focused on HBase technical discussion, though NoSQL-related database topics are welcome too; as the saying goes, learn one case and infer the rest. Cheers
    When I ran some simple SQL-related test programs, this exception occurred, which seems weird (spark-1.3.1 was used due to project needs): scala.reflect.internal.MissingRequirementError: class org.apache.spark.sql.catalyst.ScalaReflection in JavaMirror at scala.reflect.internal. ...
  I want to export a table to JSON-format files, but after googling, no solutions turned up. I know that Pig is used to do SQL-like MapReduce work, and that Hive is a data warehouse that can be built on HBase, but I couldn't find a solution/workaround with those either (maybe I missed something). So I consider using MR to figure ...
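One possible workaround in the spirit of this series: since Spark is already at hand, scan the table via TableInputFormat and emit one JSON object per row. A sketch only: it assumes the HBase 0.96+ Cell API, a placeholder table name and output path, and does no real JSON escaping:

    import org.apache.hadoop.hbase.{CellUtil, HBaseConfiguration}
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.spark.{SparkConf, SparkContext}

    object HBaseTableToJson {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("HBaseTableToJson").setMaster("local[2]"))

        val hconf = HBaseConfiguration.create()
        hconf.set(TableInputFormat.INPUT_TABLE, "my_table")   // placeholder table name

        val rows = sc.newAPIHadoopRDD(hconf, classOf[TableInputFormat],
          classOf[ImmutableBytesWritable], classOf[Result])

        val jsonLines = rows.map { case (rowKey, result) =>
          val cells = result.rawCells().map { cell =>
            val q = Bytes.toString(CellUtil.cloneQualifier(cell))
            val v = Bytes.toString(CellUtil.cloneValue(cell))
            s""""$q":"$v""""                                  // naive: no JSON escaping
          }
          val all = s""""rowkey":"${Bytes.toString(rowKey.get())}"""" +: cells
          all.mkString("{", ",", "}")                         // one JSON object per row
        }
        jsonLines.saveAsTextFile("hdfs:///tmp/my_table_json") // placeholder output path
        sc.stop()
      }
    }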