<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'><id>tag:blogger.com,1999:blog-7006618857347978140.post3358580197097995994..comments</id><updated>2010-04-01T00:25:54.831-07:00</updated><title type='text'>Comments on Next year's trends reported today: Hadoop should target C++/LLVM, not Java (because o...</title><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.trendcaller.com/feeds/3358580197097995994/comments/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7006618857347978140/3358580197097995994/comments/default'/><link rel='alternate' type='text/html' href='http://www.trendcaller.com/2009/05/hadoop-should-target-cllvm-not-java.html'/><author><name>Kevin Lawton</name><uri>http://www.blogger.com/profile/03442192017196947120</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://3.bp.blogspot.com/_Fe2SEsPN47o/Syw7wMIMW-I/AAAAAAAAAIQ/Gywo_bZtoTs/S220/kevin_scaled96x96.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>2</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-7006618857347978140.post-7467576211159251264</id><published>2009-05-23T15:09:30.587-07:00</published><updated>2009-05-23T15:09:30.587-07:00</updated><title type='text'>Hi,

I wouldn't like to comment on your concerns about...</title><content type='html'>Hi,&lt;br /&gt;&lt;br /&gt;I wouldn't like to comment on your concerns about power consumption, but I'd like to contribute some ideas.&lt;br /&gt;&lt;br /&gt;1. If you consider RTSJ (Real-Time Specification for Java), you could use ITC (Initialization Time Compilation) instead of JIT. RTSJ can also speed up your Java application if you use "Soft Real Time Threads", which are not difficult to implement, and it can keep the GC from managing memory that you can easily manage yourself (Scoped Memory).&lt;br /&gt;&lt;br /&gt;These links may be of interest:&lt;br /&gt;&lt;br /&gt;* http://java.sun.com/javase/technologies/realtime/reference/doc_2.1/release/JavaRTSCompilation.html&lt;br /&gt;&lt;br /&gt;* http://www.rtsj.org/specjavadoc/book_index.html&lt;br /&gt;&lt;br /&gt;2. IBM has a very interesting research project called X10, which generates Java and/or C++ code as output. The input language is something based on Scala (see release 1.7.x).&lt;br /&gt;You could use it for Write-Once-Run-Everywhere; it does not matter whether you have a JVM or your native OS.&lt;br /&gt;&lt;br /&gt;A very interesting improvement over Scala is that X10 does not use MPI; it uses PGAS, which is as beneficial as STM but provides maximum performance for local data.&lt;br /&gt;&lt;br /&gt;IBM X10 Language&lt;br /&gt;* http://www.x10-lang.org/&lt;br /&gt;* http://dist.codehaus.org/x10/documentation/languagespec/x10-173.pdf&lt;br /&gt;&lt;br /&gt;STM (Software Transactional Memory)&lt;br /&gt;* http://en.wikipedia.org/wiki/Software_transactional_memory&lt;br /&gt;&lt;br /&gt;PGAS (Partitioned Global Address Space)&lt;br /&gt;* http://en.wikipedia.org/wiki/Partitioned_global_address_space&lt;br /&gt;&lt;br /&gt;Regards&lt;br /&gt;&lt;br /&gt;Richard Gomes&lt;br /&gt;http://www.jquantlib.org/index.php/User:RichardGomes</content><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/7006618857347978140/3358580197097995994/comments/default/7467576211159251264'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7006618857347978140/3358580197097995994/comments/default/7467576211159251264'/><link rel='alternate' type='text/html' href='http://www.trendcaller.com/2009/05/hadoop-should-target-cllvm-not-java.html?showComment=1243116570587#c7467576211159251264' title=''/><author><name>rgomes1997</name><uri>http://www.blogger.com/profile/10993828277309610363</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.trendcaller.com/2009/05/hadoop-should-target-cllvm-not-java.html' ref='tag:blogger.com,1999:blog-7006618857347978140.post-3358580197097995994' source='http://www.blogger.com/feeds/7006618857347978140/posts/default/3358580197097995994' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-240702470'/></entry><entry><id>tag:blogger.com,1999:blog-7006618857347978140.post-2998924720727313896</id><published>2009-05-13T06:20:00.000-07:00</published><updated>2009-05-13T06:20:00.000-07:00</updated><title type='text'>(I thought this was intriguing enough, and was pro...</title><content type='html'>(I thought this was intriguing enough, and was proud enough of my reply on this to the mahout-dev list, that I will favor you with a cross post here!)&lt;br /&gt;&lt;br /&gt;The difference in power consumption between a fully loaded machine and&lt;br /&gt;idle isn't so large (the figure 50% sticks in my head?), but the&lt;br /&gt;difference between a fully loaded and half-loaded machine is quite&lt;br /&gt;small. 
That is, if the hard disk is up, the processor is at full speed, and&lt;br /&gt;all memory is fully powered, then using all or most of it is not a big deal.&lt;br /&gt;Power consumption drops only if you are really idle.&lt;br /&gt;&lt;br /&gt;I don't have numbers to back this up at my fingertips, though they're&lt;br /&gt;informed by figures I've seen in the past. I think that's what one&lt;br /&gt;would need to evaluate this argument, and I have a different intuition&lt;br /&gt;about how much this could matter.&lt;br /&gt;&lt;br /&gt;The main argument here seems to be, basically, that Java competes well&lt;br /&gt;in wall-time performance by better parallelism and more memory usage.&lt;br /&gt;Maybe; that's an interesting question. Is LLVM going to be more&lt;br /&gt;efficient than Java? Unclear; both have an overhead, I suppose. But&lt;br /&gt;again, an interesting question.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;But, the topic really does matter. Wasting time means wasting energy,&lt;br /&gt;and when we get to distributed cluster scale, it matters to the&lt;br /&gt;environment. At Google they do a good job of keeping teams really&lt;br /&gt;clear about how much their operations are costing -- it is staggering&lt;br /&gt;sometimes. Developers who might run a big job, oops, see it fail,&lt;br /&gt;start it up again, oops, wrong argument again... might think twice&lt;br /&gt;when they realize how many pounds of CO2 their mistake just pumped into&lt;br /&gt;the atmosphere.&lt;br /&gt;&lt;br /&gt;(Mahout folks will now appreciate why I have been messing with the&lt;br /&gt;code all over to try to micro-optimize for performance. I think there&lt;br /&gt;is still not enough attention given to efficiency yet, but hey it's at&lt;br /&gt;0.1.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;And, I think I agree with the conclusion of the blog post for a&lt;br /&gt;different reason:&lt;br /&gt;&lt;br /&gt;The Java/C++ performance gap for most apps is pretty negligible these&lt;br /&gt;days. Why? 
I actually think given a fixed amount of *developer* time,&lt;br /&gt;one can make a faster Java app than C++ app. Why? I can develop&lt;br /&gt;faster, against a larger and more stable collection of libraries,&lt;br /&gt;spend less time debugging, leaving more time to optimize the result.&lt;br /&gt;&lt;br /&gt;But that does hit a certain plateau. Given enough developer time, I&lt;br /&gt;can get native code to run faster than even JITted Java. I myself am&lt;br /&gt;hard-pressed to optimize my code (Mahout - Taste) further in Java&lt;br /&gt;without drastic measures.&lt;br /&gt;&lt;br /&gt;It may take a lot of time to actually beat Java performance in C++,&lt;br /&gt;but, as the scale of your operations grows, the return on that 1%&lt;br /&gt;improvement you eke out grows. And of course -- when we talk about&lt;br /&gt;code headed for Hadoop, we are definitely talking about large-scale&lt;br /&gt;operations.&lt;br /&gt;&lt;br /&gt;For reference, of course, Google operates at such a scale that they&lt;br /&gt;use a C++-based MapReduce framework. It is just almost always&lt;br /&gt;worthwhile to spend the time to beat Java performance.&lt;br /&gt;&lt;br /&gt;This isn't going to be true of all users of distributed computing&lt;br /&gt;frameworks, so it's not inherently wrong that Hadoop is in Java, but,&lt;br /&gt;I did find myself saying "hmm, Java?" the first time I heard of&lt;br /&gt;Hadoop.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;But isn't this what this whole Hadoop streaming business is about?&lt;br /&gt;letting you farm out the computation itself to whatever native process&lt;br /&gt;you like and just using Hadoop for the management? 
because that of&lt;br /&gt;course is fine.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7006618857347978140/3358580197097995994/comments/default/2998924720727313896'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7006618857347978140/3358580197097995994/comments/default/2998924720727313896'/><link rel='alternate' type='text/html' href='http://www.trendcaller.com/2009/05/hadoop-should-target-cllvm-not-java.html?showComment=1242220800000#c2998924720727313896' title=''/><author><name>srowen</name><uri>http://www.blogger.com/profile/06524814758673314736</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.trendcaller.com/2009/05/hadoop-should-target-cllvm-not-java.html' ref='tag:blogger.com,1999:blog-7006618857347978140.post-3358580197097995994' source='http://www.blogger.com/feeds/7006618857347978140/posts/default/3358580197097995994' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-1101502033'/></entry></feed>
