BD MD Counties

BigData - Example 2 - Finding Info on Counties in Maryland

We found a file with data about counties in Maryland. It is a small dataset, but it lets us show more techniques for working with data.

1. Open a new Mainstack, add a button named "Load File", and add a scrolling text field named "mydata" (the code below refers to the field by that name)

Add the following code to the button:

on mouseUp
answer file "Pick a file to use" // for the user to select a file
put it into myfile // save the filename and its location (path)
put url ("file:" & myfile) into field "mydata" // load the contents of the file into the field
end mouseUp
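If the user cancels the file dialog, "it" comes back empty and the script above would try to load a file with no name. A minimal sketch of one way to guard against that, using the same names as above (the early exit is our addition, not part of the original example):

```livecode
on mouseUp
   answer file "Pick a file to use"
   if it is empty then exit mouseUp // the user cancelled, so there is nothing to load
   put it into myfile
   put url ("file:" & myfile) into field "mydata"
end mouseUp
```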

Download the file:

Choose_Maryland___Compare_Counties_-_Demographics.csv

Click on the button "Load File" and select the file that you just downloaded. You should see the following:

We are only interested in the following fields:

field 1 = County Name, field 8 = Median Age of residents and field 10 = Household Income

So let us:

- load the file into a variable which we will call temp,

- then go through the data line by line

- and put only what we want into the field "mydata" (fields 1, 8 and 10)

The code on the button "Load File" now becomes:

on mouseUp
answer file "Pick a file to use"
put it into myfile
put url ("file:" & myfile) into temp // load file into the variable temp

put empty into field "mydata"

repeat for each line x in temp
put item 1 of x & " , " & item 8 of x & " , " & item 10 of x & return after field "mydata"
end repeat
end mouseUp

notes:

1) We clear out the field "mydata" by putting "empty" into it

2) Notice that we put data from each line "after" and not "into"

put item 1 into field "mydata" - wipes out whatever was in there before (it replaces it completely)

put item 1 after field "mydata" - adds or appends it after what is there in the field

So we want to use "after" to keep adding each county to the list

3) We include a & return which puts it on a new line. So each county is on a separate line
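The difference between "into" and "after" can be tried out on a plain variable. This is only a small sketch: the variable name tList and the three county names are sample values chosen for illustration, not taken from the file.

```livecode
on mouseUp
   put "Allegany" & return into tList // tList holds one line: Allegany
   put "Calvert" & return after tList // appended: tList now holds Allegany and Calvert
   put "Garrett" & return into tList // "into" replaces everything: only Garrett is left
   answer tList
end mouseUp
```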

Load the file again and it now looks like this:

That looks so much simpler and cleaner.

Getting Rid of Bad Lines in the Data

But if you scroll down to the bottom, you see this:

We have some bad data. (Look at the last 4 lines in the screenshot above). It does not belong. Is there a way to not include that data?

Of course. Those lines have no other items (fields separated by commas), so we can check whether item 3 is empty and skip the line if it is.

Add the following code to our "Load File" button:

on mouseUp
answer file "Pick a file to use"
put it into myfile
put url ("file:" & myfile) into temp // load file into the variable temp

put empty into field "mydata"

repeat for each line x in temp
if item 3 of x is not empty then
put item 1 of x & " , " & item 8 of x & " , " & item 10 of x & return after field "mydata"
end if
end repeat
end mouseUp

Now data is taken only from complete lines.
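Another way to spot incomplete lines is to count the items on each line. The sketch below assumes that every real county row has at least 10 comma-separated items (it must, since we read item 10 from it); the threshold of 10 is our assumption, so adjust it to fit your file. The loop replaces the repeat loop in the script above.

```livecode
repeat for each line x in temp
   // keep only lines that have at least 10 comma-separated items
   if the number of items of x >= 10 then
      put item 1 of x & " , " & item 8 of x & " , " & item 10 of x & return after field "mydata"
   end if
end repeat
```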

Too Much Data - It Takes Forever to Load, or It Crashes... What Do I Do?

Answer: Just load some of the Data until you get the code finished.

1. Add a counter to keep track of the lines. I called my variable "count" and created it by putting zero (0) into it

put 0 into count

2. Increment it in the repeat code and once you load enough records, exit the repeat loop.

repeat for each line x in temp
if item 3 of x is not empty then
// char 2 to 99 skips the first character of item 10
put item 1 of x & " , " & item 8 of x & " , " & char 2 to 99 of item 10 of x & return after field "mydata"
end if
add 1 to count // count every line we have looked at
if count = 100 then exit repeat // stop after 100 lines
end repeat

Change the number to whatever you want: load only 1 line, 10 lines or 1,000 lines.
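If loading is still slow, one thing that usually helps is to collect the lines in a variable and put them into the field once, at the end, because updating a field on every pass through the loop tends to be much slower than appending to a variable. A sketch of the whole button script under that approach (the variable name tOutput is our own choice):

```livecode
on mouseUp
   answer file "Pick a file to use"
   put it into myfile
   put url ("file:" & myfile) into temp // load file into the variable temp

   put empty into tOutput // collects the lines we want to keep
   put 0 into count
   repeat for each line x in temp
      if item 3 of x is not empty then
         put item 1 of x & " , " & item 8 of x & " , " & char 2 to 99 of item 10 of x & return after tOutput
      end if
      add 1 to count
      if count = 100 then exit repeat
   end repeat

   put tOutput into field "mydata" // update the field only once, after the loop
end mouseUp
```

Because "into" replaces the field's contents, the separate step that clears the field first is no longer needed here.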

Attachment: Choose_Maryland___Compare_Counties_-_Demographics.csv (2k), cyril.pruszko@pgcps.org, Dec 5, 2016, 7:48 AM, v.1
