DSC 190 – Advanced Algorithms for Data Scientists


⚖️ Syllabus

Welcome to DSC 190 in Fall 2023! This page should answer most of the questions you might have about how the course is run; check out the frequently asked questions for answers to some common ones. If you don't find what you're looking for here, feel free to make a post on Campuswire.

Here is what the syllabus will cover:

Instructor

  • Dr. Justin Eldridge
    jeldridge@ucsd.edu
    webpage
    Lecture: 5:00 PM on T/Th in WLH 2204
    Discussion: 5:00 PM on Friday in MOS 204

Getting Started

To get started in DSC 190, you'll need to set up accounts on a couple of websites.

Campuswire

We'll be using Campuswire as our course message board. You should have received an invitation via email, but if not you should be able to join by clicking the link above and using the access code 0734. Be sure to join Campuswire as soon as possible, since all course communication will be done through it.

If you have a question about anything to do with the course — if you're stuck on a homework problem, want clarification on the logistics, or just have a general question about data science — you can make a post on Campuswire. We only ask that if your question includes some or all of an answer, please make your post private so that others cannot see it. You can also post anonymously if you would prefer.

Course staff will regularly check Campuswire and try to answer any questions that you have. You're also encouraged to answer a question asked by another student if you feel that you know the answer.

Gradescope

We'll be using Gradescope for homework submission and grading. Most of the assignments will be a mixture of math and coding, and the coding parts are usually autograded via Gradescope. You should have received an email invitation for Gradescope, but if not you can join with code XX2JJN.

Canvas

We will not be using Canvas. All course materials will be available at dsc190.com or Gradescope.

Required Materials

You will not need to purchase any materials for this course; we'll use lecture slides as the main resource, as well as our own course notes. If you'd like additional textbooks to study from, we can recommend these:

  • Dasgupta, Papadimitriou, Vazirani; Algorithms
  • Cormen, Leiserson, Rivest, Stein; Introduction to Algorithms

These books are also excellent resources for interview preparation.

Lectures

Lectures will be held in-person at the regularly-scheduled time and place, but they will be podcasted and posted online for remote viewing. Attendance is appreciated, but not required.

Since there are two sections of the course, there will be two different lecture times, but they will cover the same content on the same schedule. The lecture times are: 5:00 PM on T/Th in WLH 2204.

You may attend whichever lecture section you would like after Week 02.

You will be able to find the lecture recordings at podcast.ucsd.edu.

Office Hours

Course staff, including tutors, TAs, and instructors, will hold office hours regularly throughout the week. Please see the office hours page for the schedule and for instructions.

Discussions

Since there are two sections of the course, there are two discussion times, but they will cover the same content on the same schedule.

The discussion times are: 5:00 PM on Friday in MOS 204.

The discussions review the materials from that week's lectures and prepare you for the homework. Just as with lecture, topics and techniques introduced in discussion might appear on the homework and in exams. In particular, some of the more difficult homework problems may be partially solved in discussion section to give you a good start.

Discussions will also serve as midterm reviews in the weeks leading up to the exams.

Attendance is recommended, but not required. The discussions will be podcasted, but the nature of discussion sections (they usually involve a large amount of groupwork) means that the podcasted discussion might not be as useful as in-person attendance.

Labs

There will be two types of assignments in DSC 190: labs and homeworks. Labs help develop essential knowledge, while homeworks test your ability to apply that knowledge to solve more difficult problems. You can think of labs as a quick check on your understanding before you head into the homework.

Labs consist of a small number of autograded multiple choice or numerical answer questions. They will be posted on Gradescope weekly. The exams will mostly consist of questions similar in format and difficulty to those on the labs. However, the exams will have a time limit, while the labs have no time limit.

In previous iterations of DSC 190, these "essential" questions were actually a part of the homeworks. We have decided to move these essential problems to their own lab assignment, therefore making the homeworks shorter. This has a big benefit: because the labs are autograded and due before the homeworks, you'll get your lab grade before heading into the homework. This gives you an opportunity to patch up any misunderstandings.

Lab Grading

You should think of the labs as a first practice towards the goal of mastering the topics in DSC 190. But the first time you practice anything, you're not going to be perfect. The key is to learn from the mistakes.

In other classes, like DSC 40B, I encourage this with the concept of lab redemption, where you can re-earn lost credit by discussing your misconceptions with a tutor or TA. However, we only have one TA for this class, and they will be too busy with homework grading and discussions to process hundreds of redemption requests per week. This means that the usual lab redemption policy isn't feasible.

Instead of redemption, we'll use a grading formula that is forgiving of mistakes to determine your lab score at the end of the quarter:

overall lab score = (lab points earned during the quarter) / (80% of lab points available during the quarter)

In other words, in order to get a score of 100% on the lab component of the class, you need to get only 80% of the lab questions right throughout the quarter. Note that the overall lab score will be capped at 100%, so the actual formula is:

overall lab score = min { (lab points earned during the quarter) / (80% of lab points available during the quarter), 1 }

This has a similar effect to the usual lab redemption policy in that it is forgiving of mistakes, but it also means that you won't be talking to a TA about your misconceptions (which is meant to be a tool for learning). You're encouraged to review each week's lab and think critically about why you missed the questions you got wrong. Remember: the exams will have questions that are very similar in nature to the labs.
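The capped formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name and the example point totals are made up, and the official calculation is whatever the course staff run at the end of the quarter.

```python
def overall_lab_score(points_earned: float, points_available: float) -> float:
    """Return the overall lab score as a fraction between 0 and 1.

    Full credit requires only 80% of the available lab points,
    and the result is capped at 1 (that is, 100%).
    """
    if points_available <= 0:
        raise ValueError("points_available must be positive")
    full_credit_threshold = 0.8 * points_available
    return min(points_earned / full_credit_threshold, 1.0)


# Hypothetical point totals, for illustration only.
print(f"{overall_lab_score(60, 100):.2%}")  # 60 of 100 points earned
print(f"{overall_lab_score(90, 100):.2%}")  # above the 80% threshold, so capped
```

With these made-up numbers, earning 60 of 100 points gives 60 / 80 = 75%, while earning 90 of 100 points gives the capped score of 100%.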

Homeworks

There will be eight homeworks assigned throughout the quarter, plus one "super homework" (described below). Homeworks will be a mixture of written problems (which are manually graded by our tutor staff) and coding problems (which are autograded). Each homework will be due via Gradescope at 11:59 PM on the Monday after it is assigned unless otherwise noted, and you'll have roughly a week to complete each assignment from the time it is posted.

The homework due date is carefully chosen to fit within a one week "cycle". A "week" in DSC 190 will start with Tuesday's lecture, followed by Thursday's. That week's discussion on Friday will review the lecture topics with an eye towards practical application. The lab is then due on Thursday, giving you some practice before the homework. The homework is then due on the next Monday, giving you some time after the discussion and lab to complete it.

The lowest homework score is dropped. If a homework is dropped, all parts of it (programming problems, etc.) are removed from the calculation for your homework score. The homework that is dropped is chosen to maximize your overall homework score. The Super Homework cannot be dropped.
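As a rough sketch of what "chosen to maximize your overall homework score" could mean, the code below tries dropping each homework in turn and keeps the best result. It assumes that the overall homework score is total points earned divided by total points possible over the remaining homeworks. The syllabus does not state the aggregation rule, so that assumption, the function name, and the example numbers are all hypothetical.

```python
def best_homework_score(homeworks):
    """Drop the single homework whose removal maximizes the overall score.

    homeworks: list of (points_earned, points_possible) tuples.
    Returns (best_score, index_of_dropped_homework).
    ASSUMPTION: overall score = total earned / total possible.
    """
    if len(homeworks) < 2:
        raise ValueError("need at least two homeworks to drop one")
    best_score, best_index = -1.0, None
    for i in range(len(homeworks)):
        kept = homeworks[:i] + homeworks[i + 1:]
        earned = sum(e for e, _ in kept)
        possible = sum(p for _, p in kept)
        score = earned / possible
        if score > best_score:
            best_score, best_index = score, i
    return best_score, best_index


# Hypothetical scores: 1/10 (10%), 40/100 (40%), 20/20 (100%).
score, dropped = best_homework_score([(1, 10), (40, 100), (20, 20)])
print(dropped, f"{score:.2%}")
```

Under this assumption, the best homework to drop is not always the one with the lowest percentage: in the example, dropping the 40% homework (worth 100 points) yields 21 / 30 = 70%, whereas dropping the 10% homework would yield only 60 / 120 = 50%.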

Regrade Requests

If you feel that the grader has made a mistake, you may submit a regrade request via Gradescope within one week of the grades being released. Note that part of your grade is clarity, so if your answer was mostly right but unclear you may still not receive full credit.

Note that regrade requests are not the same thing as redemption requests. We cannot offer redemption for homework problems (and, as explained under Lab Grading, lab redemption is not available in this class either): homework problems are typically more complex and require more time to grade, and regrading them would take more resources than we have available.

Catastrophic regrades

If your code causes the autograder to fail because of a missing import, typo, or other small error, you can ask the TA for a catastrophic regrade. You should provide the TA with fixed code (limited to changing, adding, or removing four or fewer lines of code) to submit on your behalf. Catastrophic regrades are not intended for fixing logical errors in your code, even if they're small.

Performing catastrophic regrades is time-intensive for us, and we can't afford to do many of them. Therefore, we limit you to requesting a maximum of two catastrophic regrades per quarter. Please submit your catastrophic regrades within one week of the grades being released.

Unlike in real life, catastrophes are avoidable in DSC 190 by paying close attention to the autograder's output after submitting your code -- it will tell you if your file was named incorrectly, your imports do not exist, etc.

The "Super Homework"

Instead of a comprehensive final exam, we'll have a comprehensive "Super Homework". The super homework will focus on the content from the last two weeks of the quarter, but it will also contain material from throughout DSC 190. It will be about twice as long as a typical homework.

Because the super homework covers twice as much material as a usual homework, it will be worth roughly twice as much. However, you may still collaborate on the super homework as long as you write up solutions in your own words.

The super homework will be due during finals week (the exact date is yet to be determined).

Collaboration and AI

You are highly encouraged to think about the lab and homework problems together, but you must turn in your own solutions written in your own words. We feel that discussing homework problems is an excellent way to learn, but writing the solutions in your own words promotes a deeper, more solid understanding than discussion alone.

We recommend the following way of working on the labs and homeworks. First, meet with your partner to discuss the solutions, but don't leave the meeting with anything written down. Wait an hour or so, then write up the solutions in your own words working from memory. In that hour, you inevitably forgot some of the details of the solution. If you find that you have trouble filling them in, it's a sign that you might not have understood the solution as well as you first thought!

You're also encouraged to use AI (ChatGPT, etc.) in a similar way: you can talk to ChatGPT about a problem, but don't copy its answer verbatim. Instead, wait about an hour and put the answer in your own words. Keep in mind that ChatGPT is infamous for being very confidently wrong, so be critical of its output. Also keep in mind that you won't have ChatGPT on the exams, so you'll need to understand the fundamental concepts for yourself in order to do well.

If you have any questions or worries about whether your collaboration constitutes a violation of academic integrity, feel free to ask us on Campuswire.

Slip Days

You have five slip days to use throughout the quarter on any lab or homework (including the super homework). A slip day extends the deadline by 24 hours. Slip days cannot be "stacked" or "combined" to extend the deadline further — the latest any assignment can be submitted is 24 hours after the deadline. Slip days are applied automatically at the end of the quarter, but it's your responsibility to keep track of how many you have left.
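The slip day rule above can be sketched as a small check on submission times. This is a minimal illustration, not the course's actual bookkeeping: the function name, the example deadline, and the choice to treat a submission at exactly the deadline as on time are all assumptions.

```python
from datetime import datetime, timedelta

SLIP_DAY = timedelta(hours=24)  # one slip day extends the deadline by 24 hours


def slip_days_used(deadline: datetime, submitted: datetime):
    """Return 0 if on time, 1 if within 24 hours after the deadline,
    and None if the submission is too late to be accepted.

    Slip days cannot be stacked, so at most one is ever used per assignment.
    """
    if submitted <= deadline:
        return 0
    if submitted <= deadline + SLIP_DAY:
        return 1
    return None


# Hypothetical deadline: Monday, October 9, 2023 at 11:59 PM.
deadline = datetime(2023, 10, 9, 23, 59)
print(slip_days_used(deadline, datetime(2023, 10, 9, 23, 0)))   # on time
print(slip_days_used(deadline, datetime(2023, 10, 10, 12, 0)))  # one slip day
print(slip_days_used(deadline, datetime(2023, 10, 11, 0, 30)))  # too late
```

Summing the returned values over all of your assignments gives the number of slip days used, which you can compare against the five available.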

Slip days are designed to be a transparent and predictable source of leniency in deadlines. You can use a slip day if you are too busy to complete an assignment on its original due date (or if you forgot about it). But slip days are also meant for things like the internet going down at 11:58 PM just as you go to submit your homework. Slip days are to be used in exceptional circumstances, so you probably shouldn't get close to using all of them; if you do get close to using that many, we will likely reach out to make sure that everything is OK.

Note that slip days are not designed to help in the case of a serious illness or other unfortunate event that severely disrupts your ability to participate in the class. If something like that should arise, please let us know ASAP!

Exams

Midterms

There will be two midterm exams:

  • Midterm 01: Tuesday, October 31 (focuses on Lectures 01 — 08)
  • Midterm 02: Thursday, November 30 (focuses on Lectures 09 — 15)

The exams will be held in-person during the regularly-scheduled lecture times.

Final Exam

The final exam for DSC 190 is a "no fault" final split into two sections:

  1. An optional Midterm 01 "Redemption" section focusing on Lectures 01 — 08
  2. An optional Midterm 02 "Redemption" section focusing on Lectures 09 — 15

If your score on the midterm redemption section is higher than your score on the original midterm, it will replace that grade. Getting a lower score on a redemption section cannot hurt you (but it will make us sad). As a consequence, the redemption sections are effectively optional.

Under this policy, a bad performance on an earlier exam can be erased by good performance on the same material in a later exam.

Example: You got an "F" on Midterm 1 and a "B" on Midterm 2. You decide to take only the first redemption section on the final (though you could have taken both), and you receive an "A". Your midterm scores are now "A" and "B".
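In code, the redemption rule amounts to taking the larger of the two scores. The sketch below uses numeric percentages in place of letter grades; the function name and the specific numbers standing in for "F", "B", and "A" are assumptions made for illustration.

```python
def final_midterm_score(midterm, redemption=None):
    """Return the midterm score after applying the redemption policy.

    If the redemption section was skipped (None), the original score stands.
    Otherwise the higher of the two scores is kept, so a low redemption
    score can never hurt.
    """
    if redemption is None:
        return midterm
    return max(midterm, redemption)


# Hypothetical percentages mirroring the example above.
midterm_01 = final_midterm_score(55, redemption=95)  # "F", then "A" on redemption
midterm_02 = final_midterm_score(85)                 # "B", redemption skipped
print(midterm_01, midterm_02)
```

Here the first midterm score is replaced by the redemption score, and the second is left unchanged.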

The redemption exams will be held on the date scheduled by the registrar: Friday, December 15.

Note that the topics from Lectures 16, 17, and 18 are not on any exam. These will instead be tested in the Super Homework.

Grading

We'll be using the following grading scheme:

  • 12.5%: Labs (see Lab Grading above)
  • 30%: Homeworks (lowest dropped)
  • 7.5%: "Super Homework"
  • 25%: Midterm 01 (or Redemption Midterm 01, whichever is larger)
  • 25%: Midterm 02 (or Redemption Midterm 02, whichever is larger)
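The weights above combine into an overall percentage as a weighted sum. The following is a minimal sketch of that arithmetic; the dictionary keys, the function name, and the example scores are hypothetical, and each component score is assumed to be a percentage with the lab formula, homework drop, and midterm redemption already applied.

```python
# Weights taken from the grading scheme above; they sum to 100%.
WEIGHTS = {
    "labs": 0.125,
    "homeworks": 0.30,
    "super_homework": 0.075,
    "midterm_01": 0.25,
    "midterm_02": 0.25,
}


def overall_grade(scores):
    """Return the weighted overall percentage.

    scores: dict mapping each key in WEIGHTS to a percentage (0 to 100).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, for illustration only.
example = {
    "labs": 100,
    "homeworks": 90,
    "super_homework": 80,
    "midterm_01": 95,
    "midterm_02": 85,
}
print(f"{overall_grade(example):.2f}")
```

For these made-up scores the sum is 12.5 + 27 + 6 + 23.75 + 21.25 = 90.5.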

In a typical quarter, the midterm redemption policy has the same effect as a traditional "curve", therefore replacing the need for one. The standard grading scale (where an A is 93+, A- is 90+, B+ is 87+, etc.) will be used as a starting point, but once all scores are in, we will run a clustering algorithm to automatically find the best cutoffs for each letter grade. These cutoffs can only be lowered. For instance, the threshold for an "A" will never be higher than 93%.

A+ grades are not awarded according to a threshold. Instead, A+'s are awarded to the top 5% of students by overall grade.

Support and Resources

As instructors, our job is to foster an environment where everyone, regardless of identity, feels welcome and is able to focus on learning. If there is something we can do in this mission, or if there is something preventing you from succeeding in the class, please let us know. If you feel uncomfortable speaking with us or are searching for help on a specific concern, there are several campus resources available to you, including:

More generally, if you have any concerns about your ability to focus or succeed in this course, or just need someone to talk to, please contact us ASAP and we'll figure something out.

OSD Exam Accommodations

If you have exam accommodations from the OSD, you should receive an email from the data science program that will ask you to provide your availability for your accommodated exam. The program will then schedule the exam and notify the instructor of its time and location. If you do not receive such an email by the end of the second week of classes, please let us know!

Please be sure to respond to the email from the data science program; if the program does not hear back from you, they will be unable to schedule your accommodated exam.

FAQ

Is this class curved?

In a typical quarter, the midterm redemption policy has the same effect as a traditional "curve", therefore replacing the need for one. The standard grading scale (where an A is 93+, A- is 90+, B+ is 87+, etc.) will be used as a starting point, but once all scores are in, we will run a clustering algorithm to automatically find the best cutoffs for each letter grade. These cutoffs can only be lowered. For instance, the threshold for an "A" will never be higher than 93%.