Fall 2005
Worked on an independent study with Tim McLarnan. Implemented a first draft of parallel primegen code (http://cs.earlham.edu/~lemanal/cs488/parallelprimegen).

Summer 2005
Began looking at visualization of the results from parallelprimegen. I had nothing but problems.

Sep 7 2005
Wrote first draft of an abstract (http://cs.earlham.edu/~lemanal/cs488/abstract.txt). Looked at QtiPlot (soft.proindependent.com/qtiplot.html) as a possible graphing solution.

Sep 14 2005
Wrote second draft of abstract (http://cs.earlham.edu/~lemanal/cs488/abstract.tex).
Idea: compress the output data. gzip and bzip2 work on streams. ppm - Prediction by Partial Matching.

Sep 15 2005
Read a bit about libbzip2: http://www.bzip.org/1.0.3/bzip2-manual-1.0.3.html#top-level
bzip2 and gzip each take an argument -1 through -9 that trades speed (-1) against compression ratio (-9); for bzip2 the level sets the block size, so it also affects memory use.

-rw------- 1 lemanal users  35M Sep 15 18:28 data.clean
-rw------- 1 lemanal users 8.7M Sep 15 18:34 data-1.bz2
-rw------- 1 lemanal users 8.4M Sep 15 18:29 data-9.bz2
-rw------- 1 lemanal users  11M May 17 16:49 data-9.gz
-rw------- 1 lemanal users  14M Sep 15 18:38 data-1.gz

Compressed size as a fraction of the original: 31% for gzip -9, 24% for bzip2 -9.
ppm searches didn't yield much:
http://compression.ru/ds/
http://sourceforge.net/projects/pzip

Sep 16 2005
Updated first post with a link to my preliminary code.

Sep 20 2005
How will compression make checkpointing and recovery difficult?

Sep 28 2005
I remembered what Charlie said about binary representation. Maybe this will make the compression less important?
Also, I have a first draft of my outline up and the next draft of my abstract:
http://cs.earlham.edu/~lemanal/cs488/outline.ps
http://cs.earlham.edu/~lemanal/cs488/abstract.txt
And, I looked at storing the primes in light of the fact that all twin primes greater than 3 are of the form 6x+1 or 6x-1. I can at least save a factor of 2. I tried to start thinking about how big the text file should be, considering estimates of pi_2(x) (# of twin primes up to x).
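A minimal sketch of the 6x-1 / 6x+1 observation from the Sep 28 entry, assuming a twin pair (p, p+2) with p > 3 is stored as the single index k where p = 6k - 1. The function names twin_to_index and index_to_twin are hypothetical and are not taken from parallelprimegen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map the smaller member p of a twin pair (p, p+2),
 * p > 3, to the index k with p = 6k - 1. */
uint64_t twin_to_index(uint64_t p)
{
    assert(p > 3 && p % 6 == 5);   /* p must have the form 6k - 1 */
    return (p + 1) / 6;
}

/* Hypothetical helper: recover both members of the pair from the index k. */
void index_to_twin(uint64_t k, uint64_t *lo, uint64_t *hi)
{
    *lo = 6 * k - 1;
    *hi = 6 * k + 1;
}
```

Storing k rather than p shrinks every stored value by about a factor of 6, and one number stands for both members of the pair. The pair (3, 5) does not fit the pattern and would need special handling.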
Sep 30 2005
Updated my documentation to include a bibliography and code outline.
http://cs.earlham.edu/~lemanal/bib.txt
http://cs.earlham.edu/~lemanal/codeoutline.txt

Oct 4 2005
I met with Tim to discuss an analytical description of how much data is being stored. The (very) rough draft is up in
http://cs.earlham.edu/~lemanal/cs488/paper.tex
http://cs.earlham.edu/~lemanal/cs488/paper.ps

Oct 17 2005
Talked with Kevin Hunter for a long while about storing primes. We went over the specifics of the concise storage method from the paper. He helped me come to terms with some implementation details. Specifically, we talked about storing everything in a char pointer (minimal addressable space) and casting an offset of that pointer to some other pointer type when copying data of a different type into that array. Also, the data about the type we are storing can be stored in two bits. There are 4 different addressable memory sizes (char, short, int, long). That's simpler than the shifting I imagined.

Oct 20 2005
Began code based on last entry: http://cs.earlham.edu/~lemanal/cs488/shrink.c
Ran into a couple of problems (from an email to Charlie):

Here's an example for shorts (2 bytes):

    /* 16384 = 2^(16-2) */
    /* 0100 0000 0000 0000 = 16384 */
    else if (input[i] < 16384) {   /* strictly less: 16384 itself would spill into the tag bits */
        tmp = output + byte_count;
        *((unsigned short *)tmp) = (unsigned short)input[i] + 16384;
        byte_count = byte_count + sizeof(unsigned short);
    }

This stores a value that fits in two bytes using pointer arithmetic, putting 01 in the top two bits to designate it as a short. For storing 300 this means:
300 + 16384 = 0x412C
The char array looks like:
output[byte_count]   = 2C
output[byte_count+1] = 41
My problem with this is that it stores the least significant byte first. My record size designator (01) is in the two most significant bits. But I won't know where to find them if I don't know how many bytes the record is stored in.
I guess I could right shift the value I want to store and add 2 rather than 16384, but I'm worried that I'm delving into terribly platform dependent code. Do you have any suggestions for keeping this clean and simple?
END OF EMAIL

Another problem was that if the input is indexed by unsigned long long, then the byte array will run out of indices before it gets to the end. A possible solution would be to have the shrink function return an array of ULL-indexed byte arrays.
I verified with Maple (not automated yet) that I am missing data. This is pretty frustrating. I have to go back and work on my old code before moving forward.

Oct 21 2005
Implemented the byte packing with ANDs and shifts rather than pointer arithmetic; this sidesteps the byte ordering problems.

Oct 25 2005
Began refactoring code in an attempt to debug.

Oct 27 2005
Apparently the primegen code doesn't work very well (at least on cairo). After lots of time in gdb tracking down errors, I narrowed it down to my function on the client that fills an array with primes to send to the server. Stepping through it in gdb showed me that primegen thinks 299 (= 13*23) is prime!
from gdb:
7: data[primeCount - 1] = 281
6: thisPrime = 307
5: lastPrime = 299
4: primeCount = 19

further investigations:
ACL:
[lemanal@acl10 primegen-0.97]$ ./primes 1 100 > primes.out
[lemanal@acl10 primegen-0.97]$ md5sum ./primes.out
45d886e08500b82881519d5cf5cbe1d6  ./primes.out
[lemanal@acl10 primegen-0.97]$ ./primes 1 1000 > primes.out
[lemanal@acl10 primegen-0.97]$ md5sum ./primes.out
2d2382f376350089fd94503d7da478db  ./primes.out

Cairo:
[lemanal@c0 primegen-0.97]$ ./primes 1 100 > primes.out
[lemanal@c0 primegen-0.97]$ md5sum ./primes.out
45d886e08500b82881519d5cf5cbe1d6  ./primes.out
[lemanal@c0 primegen-0.97]$ ./primes 1 1000 > primes.out
[lemanal@c0 primegen-0.97]$ md5sum ./primes.out
a91989f0d1317bfd2011159ee732d081  ./primes.out

That is, the md5sums match for the primes from 1-100, but not from 1-1000.
Also using the supplied primegaps code:
[lemanal@c0 primegen-0.97]$ ./primegaps
5 2 1.456543
11 4 1.585077
29 6 1.475780
97 8 1.367565
127 14 1.672640
487 24 1.743640

[lemanal@acl10 primegen-0.97]$ ./primegaps
5 2 1.456543
11 4 1.585077
29 6 1.475780
97 8 1.367565
127 14 1.672640
541 18 1.571277

Oct 28 2005
Struggled with bitwise shifts and &s while still working on packing and unpacking my data. The operations seem to be giving strange results for 64-bit types.
I forgot to mention that last time I also wrote Maple code to find the next twin prime and do a twin prime count up to n. I can use this in the future for testing. If my code and Maple return the same number of twins up to n, then they are probably giving the same twins.

Oct 31 2005
Worked on the paper. Got shrink/enlarge to work finally: http://cs.earlham.edu/~lemanal/cs488/shrink2.c
Cleaned up old graphing code to work with gnuplot. I am still getting strange results. I compared it with Excel, which looked normal. I need to read more about gnuplot and how to weight the graph.

Nov 1 2005
Worked a bunch on the paper text and diagrams.
**backlog
Last week I talked to Tim about my plots. I was doing some of the weighting incorrectly; namely, I needed to be dividing my weighted counts by log(x). The next problem was my attempt to do the weighting between all residues in a class rather than between two. I have fixed both of these problems.
I talked to the class about my problem of receiving data in the wrong order from the worker nodes. The solution proposed was to store the actual primes and an index into the file. I would sort the prime list and traverse it in order, using the indices to find the next block in the file.

Nov 9 2005
Created slides.tex for presentation.

Nov 10 2005
I've been working on trying to create good plots using gnuplot. I have code that produces data files and a gnuplot file given a residue class and output from parallelprimegen.
I'm using data from runs with 2 nodes, the assignment server and a worker, which means that it will be well ordered. I think I'm having trouble with my log function. I need a double log(unsigned long long) function. I'm sort of thinking about using GMP for arbitrary precision reals for the weighted counts. It should also provide methods for getting them from logs, adds, divides, and multiplies of lots of different types (I hope)...
GMP doesn't support a log function because of rounding (maybe). MPFR is a related project (GNU multiple-precision floating point with correct rounding). GMP on the ACLs is not compiled with --enable-mpfr.

Nov 22 2005
I've been working on calculating the twin prime counts. Today I think I got it working for the unpacked results file. It takes a lot more time than I thought it would. This makes me nervous that the hard part of this problem might be calculating the weights rather than finding the primes.
I've moved to working on packing the results in the data generation part of the project. This is showing very promising results with significantly smaller file sizes.

Storing just primes in ASCII on all of cairo:
limit            time          size
1000000000       1m7.898s      45M
10000000000      4m16.124s     409M
100000000000     49m21.277s    3.8G
1000000000000    ?             35G

Storing differences in ASCII on all of cairo:
limit            time          size
10000000000      0m51.536s     140M
100000000000     7m7.120s      1.2G
1000000000000    59m23s        11G

Packed differences on bazaar with ten nodes:
limit            time            size
1000000000       0m4.278s        6M
10000000000      0m24.998s       49M
100000000000     6m44.256s       403M
1000000000000    ?28m17.574s?    3.3G   /* time calculated though I ran out of disk */
                                        /* got size from finished calculation on ACLs */
1000000000000    84m36.625s      3.4G   /* finished on bazaar */
10000000000000   >432m58.242s    >11G

Nov 23 2005
Ok. The data output stage looks good. I'm storing packed differences like I think I should be doing.
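For reference, a hedged sketch of what shift-and-mask packing of one difference might look like, following the idea in the Oct 17 to Oct 21 entries. The layout is an assumption, not the layout in shrink2.c: the top two bits of the first byte hold the size tag (00, 01, 10, 11 for 1, 2, 4, 8 bytes) and the value follows most significant byte first. The names pack_value and unpack_value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout (not confirmed by the log): the top two bits of the first
 * byte are a size tag, 00 = 1 byte, 01 = 2 bytes, 10 = 4 bytes, 11 = 8 bytes;
 * the remaining bits hold the value, most significant byte first.
 * Returns the number of bytes written to out. */
size_t pack_value(uint64_t v, unsigned char *out)
{
    unsigned tag;
    size_t n, i;

    if (v < (UINT64_C(1) << 6))       { tag = 0; n = 1; }
    else if (v < (UINT64_C(1) << 14)) { tag = 1; n = 2; }
    else if (v < (UINT64_C(1) << 30)) { tag = 2; n = 4; }
    else {
        assert(v < (UINT64_C(1) << 62));   /* two bits are reserved for the tag */
        tag = 3; n = 8;
    }

    /* Emit bytes from most to least significant with shifts and masks,
     * so the result does not depend on the host byte order. */
    for (i = 0; i < n; i++)
        out[i] = (unsigned char)((v >> (8 * (n - 1 - i))) & 0xFF);
    out[0] |= (unsigned char)(tag << 6);
    return n;
}

/* Reads one record from in, stores the value in *v,
 * and returns the number of bytes consumed. */
size_t unpack_value(const unsigned char *in, uint64_t *v)
{
    static const size_t sizes[4] = { 1, 2, 4, 8 };
    size_t n = sizes[in[0] >> 6];      /* the tag is always in the first byte */
    uint64_t acc = in[0] & 0x3F;       /* strip the tag bits */
    size_t i;

    for (i = 1; i < n; i++)
        acc = (acc << 8) | in[i];
    *v = acc;
    return n;
}
```

With this layout 300 is written as the two bytes 41 2C, so the reader sees the size tag before anything else, which is the problem the Oct 20 email ran into with the least-significant-byte-first version.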
I _am_ having some troubles on the ACLs. MPI dies with this error:
    MPI_Recv: message truncated (rank 0, MPI_COMM_WORLD)
I'm having trouble diagnosing this problem because it does not occur when running under gdb. Under gdb my code runs fine, like it does on bazaar. I moved to the ACLs because I ran out of space on the local disks on the bazaar nodes.
I'm going to start writing the data file block offset sort. Once I have a program that outputs the primes in order, I should be able to plug it into my weighting program and get nice graphs.
I'm concerned that the weight calculations I have been doing are taking a lot of time on small datasets and producing large gnuplot data files. Maybe I parallelized the wrong portion of this project.

Nov 24 2005
Everything seems to be working well. I'm generating a data set out to 10^12 on bazaar and analyzing data out to 10^11 on an ACL. I estimated the time to calculate the weights on the ACL to be about 24 hours per residue comparison within a class (there are six of these for the mod 8 class). I am using a precision of 10 decimal places for my floating point operations (saving time) and only storing every 20th data point (saving space). This estimate does not include the time to graph. It's harder to estimate the output size because of the dynamic record size; I might run out of disk space. Hopefully this will get done before my paper is due. I'm not sure if calculating weights to 10^12 will be reasonable.

Nov 25 2005
Ran into a problem in my analysis code. The offsets got bigger than 2G on my 3G file. The solution required:
    #define _FILE_OFFSET_BITS 64
and using fseeko instead of fseek.
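A small sketch of the Nov 25 fix, assuming a POSIX system. The helper name read_block_at is hypothetical and the analysis code itself is not shown in this log; the _POSIX_C_SOURCE line is an extra assumption so that fseeko and off_t are declared even under a strict -std=c99 or -std=c11 build.

```c
#define _FILE_OFFSET_BITS 64       /* must come before any system header */
#define _POSIX_C_SOURCE 200809L    /* assumption: exposes fseeko/off_t in strict modes */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: read len bytes starting at byte offset into buf.
 * off_t is 64 bits wide here, so offsets past 2G are representable,
 * unlike the long taken by fseek on a 32-bit system.
 * Returns 0 on success, -1 on a seek failure or a short read. */
int read_block_at(FILE *fp, off_t offset, void *buf, size_t len)
{
    assert(fp != NULL && buf != NULL);
    if (fseeko(fp, offset, SEEK_SET) != 0)
        return -1;
    return fread(buf, 1, len, fp) == len ? 0 : -1;
}
```

The define only takes effect if it appears before the first #include in every source file that touches the data file, or if it is passed on the compiler command line as -D_FILE_OFFSET_BITS=64.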