Geez, it's been a while. I'm ashamed.
I started writing a Perl script to process input files, connect to the PostgreSQL database using DBI, query the tract info, and store entries (including census tract info) back into the database. So far, the script does everything but record entries.
Some key points of the script:
my $tract_sth = $dbh->prepare("SELECT t.tract FROM cshape t WHERE ST_Contains(t.geom, GeomFromText(?)) = TRUE");
prepare(), uh, prepares the query for later execution, so that a specific point can be bound where the question mark is.
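To make the placeholder idea concrete, here is a minimal sketch of how the prepared handle might be used for each GPS fix. The helper names point_wkt and lookup_tract are made up for illustration and are not from the actual script; the sketch also assumes the point is passed as WKT text with longitude first, and it ignores any SRID handling.

```perl
use strict;
use warnings;

# Build the WKT text for a point. WKT puts X (longitude) before Y (latitude).
sub point_wkt {
    my ($lon, $lat) = @_;
    return sprintf('POINT(%.8f %.8f)', $lon, $lat);
}

# Run the prepared tract query for one position and return the tract,
# or undef if no tract polygon contains the point.
# $sth is assumed to be the handle returned by $dbh->prepare(...).
sub lookup_tract {
    my ($sth, $lon, $lat) = @_;
    $sth->execute(point_wkt($lon, $lat));   # the value is bound to the "?"
    my ($tract) = $sth->fetchrow_array;     # first matching row, if any
    $sth->finish;                           # discard any remaining rows
    return $tract;
}
```

With something like this, each line of input would boil down to one call such as lookup_tract($tract_sth, $londeg, $latdeg), and the query itself is only parsed once.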
$lon =~ m/^(\d{3})(\d{2}\.\d{4})/; ### grab degrees and minutes
$deg = $1; $min = $2;
$londeg = $deg + $min / 60.0; ## do the conversion
$londeg =~ s/^(\d*\.\d{8}).*/$1/; ## limit decimals to eight numbers
if($line_segments[6] =~ m/W/) { $londeg = -$londeg; }
This segment of the code grabs the degrees and minutes, divides the minutes by 60 and adds the result to the degrees to convert to decimal degrees, then uses a regex to limit the result to eight decimal places. Finally, it checks the E/W field of the GPRMC data to see whether the value should be positive or negative. This works so compactly because Perl will treat the same scalar as either a number or a string on the fly.
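As a worked example of the conversion above, here is a small self-contained sketch. The function name nmea_lon_to_decimal is hypothetical, and it differs from the snippet in two ways that are my own assumptions: it checks that the regex actually matched before using the captures, and it uses sprintf, which rounds to eight decimals where the regex truncates. For 12311.1200 W the arithmetic is 123 + 11.12 / 60, or about 123.18533, negated because the longitude is west.

```perl
use strict;
use warnings;

# Convert an NMEA longitude (dddmm.mmmm) plus its E/W flag to signed
# decimal degrees. Returns undef if the field does not look right.
sub nmea_lon_to_decimal {
    my ($lon, $hemisphere) = @_;
    return unless defined $lon && $lon =~ m/^(\d{3})(\d{2}\.\d+)$/;
    my ($deg, $min) = ($1, $2);              # degrees and minutes
    my $decimal = $deg + $min / 60.0;        # do the conversion
    $decimal = -$decimal if defined $hemisphere && $hemisphere eq 'W';
    return sprintf('%.8f', $decimal);        # round to eight decimals
}

print nmea_lon_to_decimal('12311.1200', 'W'), "\n";
```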
Now, all we need is for the time, date, and census tract info to be stored in a new table. After that's done, the Processing app will query the database and find all results between certain times and dates. (The time/date columns will be indexed?) A list of census tracts will hold values that increment for each second of time spent in a tract. The Processing application will take the first and second results, figure out the difference in time, and increment the counter for the corresponding census tract by that amount of time. Then the application will take the second and third results, and so on. After all that, the Processing application will get census info for the tracts (also stored in a database table!) and calculate the corresponding data.
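The pairwise counting described above can be sketched roughly like this. It is written in Perl to match the script, even though the plan is to do this step in Processing, and the function name seconds_per_tract is made up. It assumes the results are already sorted by time, that times are plain epoch seconds, and that the time between two results is credited to the tract of the earlier one, which is my guess rather than a decision made in this post.

```perl
use strict;
use warnings;

# Each fix is [epoch_seconds, tract]; the list is assumed sorted by time.
# Returns a hash ref mapping tract => total seconds spent there.
sub seconds_per_tract {
    my @fixes = @_;
    my %seconds;
    for my $i (0 .. $#fixes - 1) {
        my ($t0, $tract) = @{ $fixes[$i] };
        my ($t1)         = @{ $fixes[$i + 1] };
        $seconds{$tract} += $t1 - $t0;   # credit the interval to the earlier fix
    }
    return \%seconds;
}

my $totals = seconds_per_tract(
    [ 0, '001100' ], [ 5, '001100' ], [ 9, '001200' ], [ 12, '001200' ],
);
printf "%s: %d s\n", $_, $totals->{$_} for sort keys %$totals;
```

In this toy input, the four seconds between the last fix in the first tract and the first fix in the second tract get counted toward the first tract; splitting that interval would be another reasonable choice.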
Handling transportation will come later, but will consist of looking for sudden gaps of time over a threshold, and dividing the change in distance by the change in time, which gives an average speed across the gap. Then, we will be able to get the average location for certain points in time.
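Here is one rough sketch of what the gap check might look like. None of this exists yet, so everything in it is an assumption: the names haversine_m and find_gaps, the [epoch_seconds, lat, lon] record layout, and the use of the haversine formula with a mean Earth radius of 6,371 km for the distance.

```perl
use strict;
use warnings;

my $EARTH_RADIUS_M = 6_371_000;        # mean Earth radius in metres (assumed)
my $PI             = 4 * atan2(1, 1);

# Great-circle distance in metres between two lat/lon points (degrees).
sub haversine_m {
    my ($lat1, $lon1, $lat2, $lon2) = map { $_ * $PI / 180 } @_;
    my $h = sin(($lat2 - $lat1) / 2) ** 2
          + cos($lat1) * cos($lat2) * sin(($lon2 - $lon1) / 2) ** 2;
    return 2 * $EARTH_RADIUS_M * atan2(sqrt($h), sqrt(1 - $h));
}

# Each fix is [epoch_seconds, lat, lon], sorted by time.
# Returns one record per gap longer than $threshold_s seconds.
sub find_gaps {
    my ($threshold_s, @fixes) = @_;
    my @gaps;
    for my $i (1 .. $#fixes) {
        my ($t0, $lat0, $lon0) = @{ $fixes[$i - 1] };
        my ($t1, $lat1, $lon1) = @{ $fixes[$i] };
        my $dt = $t1 - $t0;
        next unless $dt > $threshold_s;    # only gaps over the threshold
        my $dist = haversine_m($lat0, $lon0, $lat1, $lon1);
        push @gaps, { start => $t0, end => $t1, speed_mps => $dist / $dt };
    }
    return @gaps;
}
```

The speed in metres per second could then be compared against rough walking, driving, or train speeds to guess the mode of transport, though picking those cutoffs is a separate problem.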
Posted by provolot on August 13, 2007 3:06 AM
