BigData - Example 2 - Finding Info on Counties in Maryland

We found a file with data about counties in Maryland. It is a small dataset, but it will allow us to show more techniques for working with data.

1. Open a new Mainstack, add a button "Load File" and add a scrolling text field. The code in this example refers to the field as "mydata".
Download the file: Choose_Maryland___Compare_Counties_-_Demographics.csv

Click on the button "Load File" and select the file that you just downloaded. You should see the contents of the file in the text field.

We are only interested in the following fields:
field 1 = County Name
field 8 = Median Age of residents
field 10 = Household Income

So let us:
- load the file into a variable which we will call temp,
- then go through the data line by line,
- and put only what we want into the field "mydata" (fields 1, 8 and 10).

The code on the button "Load File" now becomes:
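A minimal sketch of such a handler, pieced together from the fragments that appear elsewhere on this page (the answer file prompt, the variables myfile, temp and x, and the put ... after line). The check for a cancelled dialog, the line that clears the field and the explicit itemDelimiter are assumptions, not taken from the original button script.

```livecode
on mouseUp
   answer file "Pick a file to use"
   -- assumption: stop quietly if the user cancels the dialog
   if it is empty then exit mouseUp
   put it into myfile

   put empty into temp
   -- assumption: start with an empty field on every load
   put empty into field "mydata"
   put URL ("file:" & myfile) into temp

   -- comma is already the default item delimiter; set here only to be explicit
   set the itemDelimiter to comma

   repeat for each line x in temp
      -- keep county name, median age, and household income without its first character
      put item 1 of x & " , " & item 8 of x & " , " & char 2 to 99 of item 10 of x & return after field "mydata"
   end repeat
end mouseUp
```

The char 2 to 99 part simply drops the first character of item 10 and keeps up to the next 98 characters.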
Notes:
1) We clear out temp by putting "empty" into it.
2) Notice that we put the data from each line "after" and not "into" the field:
   put item 1 into field "mydata" - wipes out whatever was in there before (it replaces it completely)
   put item 1 after field "mydata" - appends it after what is already in the field
   So we want to use "after" to keep adding each county to the list.
3) We include & return, which ends the line, so each county is on a separate line.

Load the file again. Each line now shows only the three fields we asked for. That looks so much simpler and cleaner.

Getting Rid of Bad Lines in the Data

But if you scroll down to the bottom, you will see some bad data: the last 4 lines do not belong. Is there a way to not include that data? Of course. Those lines have no other items (fields separated by commas), so we can test for that. Add the following code to our "Load" button:

   on mouseUp
      answer file "Pick a file to use"
      put it into myfile
      ...
      repeat for each line x in temp
         if item 3 of x is not empty then

With this test, data is taken only from full lines.
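To see why the item 3 test works, here is a small self-contained sketch. The three sample lines are hypothetical and only stand in for the real file; the variable name goodLines is also made up for this illustration.

```livecode
on mouseUp
   -- hypothetical sample data: two full lines and one stray line with no commas
   put "Allegany,a,b" & return & "Source: some footnote" & return & "Anne Arundel,c,d" into temp
   put empty into goodLines
   repeat for each line x in temp
      -- a line without commas has only one item, so item 3 of it is empty
      if item 3 of x is not empty then
         put item 1 of x & return after goodLines
      end if
   end repeat
   -- only the two county names are kept
   answer goodLines
end mouseUp
```

A stray line that happens to contain two or more commas would still pass this test, so it is a quick filter rather than a full validation.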
Too Much Data - It Takes Forever to Load, or It Crashes... What Do I Do?

Answer: Just load some of the data until you get the code finished.

1. Add a counter to keep track of the lines. I called my variable "count" and created it by putting zero (0) into it:

   put 0 into count

2. Increment it in the repeat loop and, once you have loaded enough records, exit the repeat loop:

   repeat for each line x in temp
      if item 3 of x is not empty then
         put item 1 of x & " , " & item 8 of x & " , " & char 2 to 99 of item 10 of x & return after field "mydata"
      end if
      add 1 to count
      if count = 100 then exit repeat
   end repeat

Change the number to whatever you want: load only 1 line, 10 lines or 1,000 lines.
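In the loop above, the counter goes up for every line read, including lines that are skipped. A variation, sketched here with hypothetical sample data and the made-up names maxRecords, kept and shortList, counts only the lines that are actually kept:

```livecode
on mouseUp
   -- hypothetical sample data standing in for the real file contents
   put "A,1,2" & return & "bad line" & return & "B,3,4" & return & "C,5,6" into temp
   put 2 into maxRecords  -- hypothetical name: how many good records to keep
   put 0 into kept
   put empty into shortList
   repeat for each line x in temp
      if item 3 of x is not empty then
         put item 1 of x & return after shortList
         -- count only the lines that passed the test
         add 1 to kept
         if kept = maxRecords then exit repeat
      end if
   end repeat
   -- shows A and B, each on its own line
   answer shortList
end mouseUp
```

Which version to use depends on whether you want to limit the lines read or the records loaded.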
