A new taxi company hired an advertising agency to advertise its services on screens at Times Square in New York, NY. The marketing company was tasked with identifying the five best screens for its client. To reach the maximum number of potential clients for the new taxi company, the criterion they decided to use was the average number of taxi pickups in close proximity to an advertising screen.
The marketing company found two public datasets that they are going to use:
1. The list of screens at Times Square
The illustration above (which was created by importing a given dataset to Google Maps) visualizes the locations of the screens.
Large datasets like this one usually consist of a) a data dictionary, a table that lists all the fields in the dataset, and b) the actual dataset in a variety of formats (Excel-compatible comma-separated values (.csv), XML, or JSON).
Familiarize yourself with both datasets. Note that the files of the second dataset are very large (up to 1 GB).
The ridership data is also given in separate files grouped by taxi company (e.g. Yellow). Pick a dataset for any company that services the Times Square area.
With the dataset structures (field names, or dictionaries) in mind, use Word to design a flow chart of the algorithm to describe the process of identifying the top five screens that would be seen most often by the taxi riders.
Note that you don’t need to provide code or calculate the top screens; just provide pseudocode for the algorithm that would perform that task.
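For orientation only, the proximity criterion could be sketched in Python as below. This is not the expected deliverable, and everything specific in it is an assumption: the tuple layouts `(screen_id, lat, lon)` and `(day, lat, lon)`, the 100 m radius, and the reading of "average" as pickups per day are all hypothetical and should be replaced by whatever the actual data dictionaries provide.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def top_screens(screens, pickups, radius_m=100.0, top_n=5):
    """Rank screens by average daily pickups within radius_m (assumed criterion).

    screens: iterable of (screen_id, lat, lon)  -- hypothetical layout
    pickups: iterable of (day, lat, lon)        -- hypothetical layout
    """
    screens = list(screens)
    counts = defaultdict(int)
    days = set()
    for day, lat, lon in pickups:
        days.add(day)
        for screen_id, s_lat, s_lon in screens:
            if haversine_m(lat, lon, s_lat, s_lon) <= radius_m:
                counts[screen_id] += 1
    n_days = max(len(days), 1)  # avoid dividing by zero on empty input
    averages = [(sid, counts[sid] / n_days) for sid, _, _ in screens]
    # Sort by average pickups per day, highest first, and keep the top_n.
    return sorted(averages, key=lambda pair: pair[1], reverse=True)[:top_n]


# Made-up demo data: two screens roughly 140 m apart and four pickups.
demo_screens = [("A", 40.7580, -73.9855), ("B", 40.7590, -73.9845)]
demo_pickups = [
    ("2020-01-01", 40.7580, -73.9855),
    ("2020-01-01", 40.7580, -73.9855),
    ("2020-01-01", 40.7590, -73.9845),
    ("2020-01-02", 40.7580, -73.9855),
]
print(top_screens(demo_screens, demo_pickups, radius_m=50.0))
```

The nested loop compares every pickup with every screen, which is easy to read but slow on files approaching 1 GB; a real solution would likely filter pickups to the Times Square area first.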
Pseudocode is a somewhat structured description of the steps of an algorithm written in plain English. You may also use variable names to refer to the same data multiple times if needed.
Assume we need to identify the list of courses that the top student in the current class has to take to graduate. We have three datasets: (1) the students' grades in the current class, as (student_id, total_grade) pairs; (2) the list of past enrollments, as (student_id, course_id) pairs; (3) the list of course ids required by the program.
1. Sort the dataset (1) by the total_grade values (descending)
2. Select the first element of the sorted list and save its student_id as top_student_id
3. Select all pairs from the dataset (2) where student_id equals top_student_id and store their course_id values in a new list (4)
4. Each course in (3) which is not present in (4) should be added to a new list (5).
Then the new list (5) is the list of the courses that the top student has to take.
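To show how such pseudocode maps onto a program, the four steps above could be sketched in Python as follows. The function name `courses_remaining` and the sample records are made up for illustration; only the field names come from the example.

```python
def courses_remaining(grades, enrollments, required_courses):
    """Return the required courses the top student has not taken yet.

    grades:           list of (student_id, total_grade)   -- dataset (1)
    enrollments:      list of (student_id, course_id)     -- dataset (2)
    required_courses: list of course_id                   -- dataset (3)
    """
    # Step 1: sort dataset (1) by total_grade, descending.
    sorted_grades = sorted(grades, key=lambda row: row[1], reverse=True)
    # Step 2: the first element belongs to the top student.
    top_student_id = sorted_grades[0][0]
    # Step 3: collect the courses the top student has already taken, list (4).
    taken = [course_id for student_id, course_id in enrollments
             if student_id == top_student_id]
    # Step 4: required courses not present in (4) form list (5).
    return [course_id for course_id in required_courses
            if course_id not in taken]


# Made-up sample data for illustration.
grades = [("s1", 88.0), ("s2", 95.5), ("s3", 71.0)]
enrollments = [("s2", "CS101"), ("s1", "CS101"), ("s2", "MATH201")]
required = ["CS101", "CS102", "MATH201", "STAT301"]

print(courses_remaining(grades, enrollments, required))
# prints ['CS102', 'STAT301']
```

Each comment corresponds to one numbered step, which is the level of detail the pseudocode is expected to capture.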