BD MD Counties

BigData - Example 2 - Finding Info on Counties in Maryland
We found a file with data about counties in Maryland. It is a small dataset, but it lets us show more techniques for working with data.

    1. Open a new Mainstack, add a button "Load File" and add a scrolling text field named "mydata" (the code refers to the field by that name)

                                

        Add the following code to the button:

on mouseUp
   answer file "Pick a file to use"                     // for the user to select a file
   put it into myfile                                        // save the filename and its location (path) 
   put url ("file:" & myfile) into field "mydata"   // load the contents of the file into the field
end mouseUp
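
        If the user presses Cancel in the file dialog, the variable it comes back empty and there is no file to load. A minimal sketch of the same handler with a guard for that case (the extra check is our addition, not part of the original button code):

```livecode
on mouseUp
   answer file "Pick a file to use"                 // for the user to select a file
   if it is empty then exit mouseUp                 // the user cancelled, so there is nothing to load
   put it into myfile                               // save the filename and its location (path)
   put url ("file:" & myfile) into field "mydata"   // load the contents of the file into the field
end mouseUp
```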

        Download the file:
                                 Choose_Maryland___Compare_Counties_-_Demographics.csv

        Click on the button "Load File" and select the file that you just downloaded. You should see the following:  

                             

        We are only interested in the following fields: 
            field 1 = County Name, field 8 = Median Age of residents and field 10 = Household Income
        
        So let us:
            - load the file into a variable, which we will call temp,
            - then go through the data line by line,
            - and put only what we want into the field "mydata" (fields 1, 8 and 10)

        The code on the button "Load File" now becomes:

on mouseUp
   answer file "Pick a file to use"
   put it into myfile
   put url ("file:" & myfile) into temp                // load file into the variable temp

   put empty into field "mydata"                      

   repeat for each line x in temp
        put item 1 of x & " ,   " & item 8 of x & "  ,  " & item 10 of x & return after field "mydata"
   end repeat
end mouseUp

        Notes:
            1) We clear out the field "mydata" by putting empty into it
            2) Notice that we put the data from each line "after" the field and not "into" it
                    put item 1 of x into field "mydata" - wipes out whatever was in there before (it replaces it completely)
                    put item 1 of x after field "mydata" - adds or appends it after what is already in the field
                So we want to use "after" to keep adding each county to the list
            3) We include & return, which ends the line, so each county is on a separate line
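
        To see the difference between "into" and "after" for yourself, you could put a short test like the one below on a separate button. This is only a sketch: the three county names are typed in by hand for illustration and are not read from the file, and it assumes the stack has the field "mydata".

```livecode
on mouseUp
   put "Allegany" & return into field "mydata"    // the field now holds one line: Allegany
   put "Baltimore" & return into field "mydata"   // "into" replaces everything: only Baltimore is left
   put "Calvert" & return after field "mydata"    // "after" appends: Baltimore, then Calvert
end mouseUp
```

        After clicking the button, the field should show Baltimore on the first line and Calvert on the second. Allegany is gone because the second "put ... into" replaced it.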
                
        Load the file again and it now looks like this:

                              

        That looks much simpler and cleaner.

Getting Rid of Bad Lines in the Data

        But if you scroll down to the bottom, you see this:

                                    

    We have some bad data (look at the last 4 lines in the screenshot above). These lines do not belong. Is there a way to not include that data?

        Of course. Those lines have no other items (fields separated by commas), so we can check that item 3 is not empty before we use a line.

        Add the following code to our "Load File" button:


on mouseUp
   answer file "Pick a file to use"
   put it into myfile
   put url ("file:" & myfile) into temp                // load file into the variable temp

   put empty into field "mydata"                      

   repeat for each line x in temp
        if item 3 of x is not empty then
            put item 1 of x & " ,   " & item 8 of x & "  ,  " & item 10 of x & return after field "mydata"
        end if
   end repeat
end mouseUp


  This uses only the full lines to get the data.
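
  Another way to spot a bad line is to count its items instead of testing a single item. The sketch below assumes that a good line has at least 10 items, because item 10 is the last one we read. That threshold is our assumption, so adjust it to match the file.

```livecode
on mouseUp
   answer file "Pick a file to use"
   put it into myfile
   put url ("file:" & myfile) into temp       // load file into the variable temp

   put empty into field "mydata"

   repeat for each line x in temp
      // assumption: a full line has at least 10 comma-separated items
      if the number of items of x >= 10 then
         put item 1 of x & " ,   " & item 8 of x & "  ,  " & item 10 of x & return after field "mydata"
      end if
   end repeat
end mouseUp
```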





Too Much Data - It Takes Forever to Load, or It Crashes... What Do I Do?


        Answer: Just load some of the data until you get the code finished.

    1. Add a counter to keep track of the lines. I called my variable "count" and created it by putting zero (0) into it
            
                put 0 into count
                     
    2. Increment it inside the repeat loop and, once you have read enough lines, exit the repeat loop.
    
               repeat for each line x in temp
                      if item 3 of x is not empty then
                             put item 1 of x & " ,   " & item 8 of x & "  ,  " & char 2 to 99 of item 10 of x & return after field "mydata"
                      end if
                      add 1 to count
                      if count = 100 then exit repeat
               end repeat

    Change the number to whatever you want: load only 1 line, 10 lines or 1,000 lines. Note that this version also uses char 2 to 99 of item 10, which drops the first character of that item.
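
    Another thing that may help with a big file is to update the field less often. Changing a field is usually slower than changing a variable, so the sketch below collects the lines in a variable and puts them into the field once at the end. The variable name mylist is made up for this example, and how much time this saves depends on your file and computer.

```livecode
on mouseUp
   answer file "Pick a file to use"
   put it into myfile
   put url ("file:" & myfile) into temp       // load file into the variable temp

   put 0 into count
   put empty into mylist                      // mylist (hypothetical name) collects the lines we keep

   repeat for each line x in temp
      if item 3 of x is not empty then
         put item 1 of x & " ,   " & item 8 of x & "  ,  " & char 2 to 99 of item 10 of x & return after mylist
      end if
      add 1 to count
      if count = 100 then exit repeat
   end repeat

   put mylist into field "mydata"             // update the field only once, after the loop
end mouseUp
```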


cyril.pruszko@pgcps.org,
Dec 5, 2016, 7:48 AM