How to Do Research

I am interested in Artificial Intelligence, Neuroscience, Cognitive Science, Psychology and Education. Although my major is Computer Science, I stay hungry, foolish and curious about many areas. Mathematics and Statistics can help us abstract things better, and Philosophy can help us understand the essence behind them. As researchers, we need to maintain a sense of awe toward knowledge.1 (The footnote links to an illustrated article written by Matt Might.)

Motto

"Be curious. Read widely. Try new things. What people call intelligence just boils down to curiosity." -- Aaron Swartz

🎁 Andrew Ng on Building a Career in Machine Learning

What does it take to build your skills and successfully break into AI? In this ACM webinar on December 4, Andrew gives some tips on how to approach building your career in AI and answers questions from the audience.2 Here are my notes for part of the video:

Building a Career in Machine Learning

  • Area (horizontal): Machine Learning / Neural Networks / Graph Models / Data Science / Writing Code / ... Coursework + Reading Skills

  • Depth (vertical): Work on open-source projects. Actually know how to build things.

  • Two more things: 1. Dirty work; 2. Life-long learning.

Recruiters look for

  • Skills (ML quiz, coding)
  • Meaningful projects
    • Can actually do work

🎯 Life-long Learning

I keep the habit of reading for at least two hours a day. I used to read Chinese classics and have now switched to English literature. These days, more of my time goes to writing code to reproduce experimental results or implement ideas. It turns out that you can learn new things by reading, coding and in other ways, as long as you keep up life-long learning. I have also benefited from online education such as the MOOC website Coursera.

Find efficient reading time

Different people are at their most efficient at different times of the day, so find the hours when you read best.

📰 Reading research papers

  1. Compile a list of papers 📁 (arXiv/Medium/Blog Posts/etc.)
  2. Skip around the list 👁️‍🗨️ and mark each paper 0~100% read/understood. For example, you start to read 10 papers in order, and find that:
    • ❎ Paper 2 is a dud -- something is wrong or makes no sense, so skip this paper;
    • ✅ Paper 3 is seminal -- spend more time reading and understanding it;
    • ↩ Paper 6 is in Paper 4's citations -- go back and refresh your understanding of Paper 4;
  3. Some rough guidelines:
    • 5~20 papers: good enough to do some work and apply some algorithms;
    • 50~100 papers: give a very good understanding of an area.
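
The bookkeeping in the steps above can be sketched in a few lines of Python. This is only an illustration: the `Paper` class, the `coverage_summary` helper and the 80% cut-off for counting a paper as "understood" are my own assumptions, not part of Andrew's advice, and the range between 20 and 50 papers, which the guidelines leave open, is simply folded into the first level.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    percent_understood: int = 0  # 0 to 100, updated as you skip around the list
    note: str = ""


def coverage_summary(papers, threshold=80):
    """Count papers understood at or above `threshold` percent (an assumed
    cut-off) and map that count onto the rough guidelines above."""
    done = sum(1 for p in papers if p.percent_understood >= threshold)
    if done >= 50:
        level = "very good understanding of the area"
    elif done >= 5:
        level = "enough to do some work and apply some algorithms"
    else:
        level = "keep reading"
    return done, level


# Hypothetical reading list, mirroring the example in step 2.
reading_list = [
    Paper("Paper 1", 40),
    Paper("Paper 2", 0, "dud, skipped"),
    Paper("Paper 3", 100, "seminal"),
    Paper("Paper 4", 90, "re-read after Paper 6"),
]
done, level = coverage_summary(reading_list)
print(f"{done} papers understood: {level}")
```

With this sample list, two papers pass the cut-off, so the summary still says to keep reading.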

👍 Top Conference

I recommend abhshkdz/ai-deadlines, which shows countdowns to top CV/NLP/ML/Robotics/AI conference deadlines. You can also find reading lists and road-maps by searching on GitHub.

Follow good, even top, researchers

  • Not only on Twitter (Homepage/Blog/Article/Google Scholar...)
  • Your group can share interesting work online.
  • Famous tech teams: Google Brain / DeepMind / OpenAI / ...

📃 How to Read a Paper

S. Keshav used THE THREE-PASS APPROACH3 to efficiently read papers. The key idea is that you should read the paper in up to three passes, instead of starting at the beginning and plowing your way to the end. Each pass accomplishes specific goals and builds upon the previous pass: The first pass gives you a general idea about the paper. The second pass lets you grasp the paper’s content, but not its details. The third pass helps you understand the paper in depth.

If English is not your native language, you may need more time.

1️⃣ The first pass

The first pass is a quick scan to get a bird’s-eye view of the paper. You can also decide whether you need to do any more passes. This pass should take about five to ten minutes and consists of the following steps:

  1. Carefully read the title, abstract, figures, and introduction
  2. Read the section and sub-section headings, but ignore everything else
  3. Read the conclusions
  4. Glance over the references, mentally ticking off the ones you’ve already read

At the end of the first pass, you should be able to answer the five Cs:

  1. Category: What type of paper is this? A measurement paper? An analysis of an existing system? A description of a research prototype?
  2. Context: Which other papers is it related to? Which theoretical bases were used to analyze the problem?
  3. Correctness: Do the assumptions appear to be valid?
  4. Contributions: What are the paper’s main contributions?
  5. Clarity: Is the paper well written?

Using this information, you may choose not to read further. This could be because the paper doesn’t interest you, or you don’t know enough about the area to understand the paper, or that the authors make invalid assumptions. The first pass is adequate for papers that aren’t in your research area, but may someday prove relevant.

Incidentally, when you write a paper, you can expect most reviewers (and readers) to make only one pass over it. Take care to choose coherent section and sub-section titles and to write concise and comprehensive abstracts. If a reviewer cannot understand the gist after one pass, the paper will likely be rejected; if a reader cannot understand the highlights of the paper after five minutes, the paper will likely never be read.

2️⃣ The second pass

In the second pass, read the paper with greater care, but ignore details such as proofs. It helps to jot down the key points, or to make comments in the margins, as you read.

  1. Look carefully at the figures, diagrams and other illustrations in the paper. Pay special attention to graphs. Are the axes properly labeled? Are results shown with error bars, so that conclusions are statistically significant? Common mistakes like these will separate rushed, shoddy work from the truly excellent. Skip/skim the math.
  2. Remember to mark relevant unread references for further reading (this is a good way to learn more about the background of the paper).

The second pass should take up to an hour. After this pass, you should be able to grasp the content of the paper. You should be able to summarize the main thrust of the paper, with supporting evidence, to someone else. This level of detail is appropriate for a paper in which you are interested, but does not lie in your research speciality.

Sometimes you won’t understand a paper even at the end of the second pass. This may be because the subject matter is new to you, with unfamiliar terminology and acronyms. Or the authors may use a proof or experimental technique that you don’t understand, so that the bulk of the paper is incomprehensible. The paper may be poorly written with unsubstantiated assertions and numerous forward references. Or it could just be that it’s late at night and you’re tired. You can now choose to: (a) set the paper aside, hoping you don’t need to understand the material to be successful in your career, (b) return to the paper later, perhaps after reading background material or (c) persevere and go on to the third pass.

3️⃣ The third pass

To fully understand a paper, particularly if you are a reviewer, requires a third pass. The key to the third pass is to attempt to virtually re-implement the paper: that is, making the same assumptions as the authors, re-create the work. By comparing this re-creation with the actual paper, you can easily identify not only a paper’s innovations, but also its hidden failings and assumptions.

This pass requires great attention to detail. You should identify and challenge every assumption in every statement. Moreover, you should think about how you yourself would present a particular idea. This comparison of the actual with the virtual lends a sharp insight into the proof and presentation techniques in the paper and you can very likely add this to your repertoire of tools. During this pass, you should also jot down ideas for future work.

This pass can take about four or five hours for beginners, and about an hour for an experienced reader. At the end of this pass, you should be able to reconstruct the entire structure of the paper from memory, as well as be able to identify its strong and weak points. In particular, you should be able to pinpoint implicit assumptions, missing citations to relevant work, and potential issues with experimental or analytical techniques.

🔓 Understand Paper Deeply

You can also use search engines to find blog posts and forums that discuss and explain related papers. The best way to test your grasp of the concepts in a paper is to give a presentation to members of your lab's research team. If you can easily answer any question about the details, you understand the paper very well.

Math

Re-derive from scratch. (Heavy-math4 ?)

Code

  1. Run open-source code. (Check the result)
  2. Re-implement from scratch. (Learn a lot of tricks)

📔 Doing a Literature Survey

Paper reading skills are put to the test in doing a literature survey. This will require you to read tens of papers, perhaps in an unfamiliar field. What papers should you read? Here is how you can use the three-pass approach to help.

  • First, use an academic search engine such as Google Scholar or CiteSeer and some well-chosen keywords to find three to five recent papers in the area. Do one pass on each paper to get a sense of the work, then read their related work sections. You will find a thumbnail summary of the recent work, and perhaps, if you are lucky, a pointer to a recent survey paper. If you can find such a survey, you are done. Read the survey, congratulating yourself on your good luck.

  • Otherwise, in the second step, find shared citations and repeated author names in the bibliography. These are the key papers and researchers in that area. Download the key papers and set them aside. Then go to the websites of the key researchers and see where they’ve published recently. That will help you identify the top conferences in that field because the best researchers usually publish in the top conferences.

  • The third step is to go to the website for these top conferences and look through their recent proceedings. A quick scan will usually identify recent high-quality related work. These papers, along with the ones you set aside earlier, constitute the first version of your survey. Make two passes through these papers. If they all cite a key paper that you did not find earlier, obtain and read it, iterating as necessary.
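
The second step, finding shared citations and repeated author names, is essentially a counting exercise. Here is a minimal sketch, assuming you have already typed in the bibliographies of your seed papers by hand; the `most_shared` helper, the data layout and all paper and author names are hypothetical placeholders, not something from Keshav's paper.

```python
from collections import Counter


def most_shared(groups, min_count=2):
    """Count how many groups each item appears in and return the items that
    appear in at least `min_count` of them, most shared first."""
    counts = Counter()
    for group in groups:
        counts.update(set(group))  # count an item once per bibliography
    shared = [(item, n) for item, n in counts.items() if n >= min_count]
    return sorted(shared, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical bibliographies of three seed papers: (cited title, authors).
bibliographies = [
    [("Paper X", ["Author A", "Author B"]), ("Paper Y", ["Author C"])],
    [("Paper X", ["Author A", "Author B"]), ("Paper Z", ["Author A"])],
    [("Paper X", ["Author A", "Author B"]), ("Paper Y", ["Author C"])],
]

key_papers = most_shared([[title for title, _ in bib] for bib in bibliographies])
key_authors = most_shared(
    [[a for _, authors in bib for a in authors] for bib in bibliographies]
)
print(key_papers)
print(key_authors)
```

Each item is counted once per bibliography, so a paper or author only ranks highly when several different seed papers point to it. In this toy data, "Paper X" is cited by all three seed papers and would be the first key paper to download.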

🧠 A Survival Guide to a PhD

Andrej Karpathy

From Andrej Karpathy's blog.


Ask yourself if you find the following properties appealing:

Freedom. A PhD will offer you a lot of freedom in the topics you wish to pursue and learn about. You’re in charge. Of course, you’ll have an adviser who will impose some constraints but in general you’ll have much more freedom than you might find elsewhere.

Ownership. The research you produce will be yours as an individual. Your accomplishments will have your name attached to them. In contrast, it is much more common to “blend in” inside a larger company. A common feeling here is becoming a “cog in a wheel”.

Exclusivity. There are very few people who make it to the top PhD programs. You’d be joining a group of a few hundred distinguished individuals in contrast to a few tens of thousands (?) that will join some company.

Status. Regardless of whether it should be or not, working towards and eventually getting a PhD degree is culturally revered and recognized as an impressive achievement. You also get to be a Doctor; that’s awesome.

Personal freedom. As a PhD student you’re your own boss. Want to sleep in today? Sure. Want to skip a day and go on a vacation? Sure. All that matters is your final output and no one will force you to clock in from 9am to 5pm. Of course, some advisers might be more or less flexible about it and some companies might be as well, but it’s a true first order statement.

Maximizing future choice. Joining a PhD program doesn’t close any doors or eliminate future employment/lifestyle options. You can go one way (PhD -> anywhere else) but not the other (anywhere else -> PhD -> academia/research; it is statistically less likely). Additionally (although this might be quite specific to applied ML), you’re strictly more hirable as a PhD graduate or even as a PhD dropout and many companies might be willing to put you in a more interesting position or with a higher starting salary. More generally, maximizing choice for the future you is a good heuristic to follow.

Maximizing variance. You’re young and there’s really no need to rush. Once you graduate from a PhD you can spend the next ~50 years of your life in some company. Opt for more variance in your experiences.

Personal growth. PhD is an intense experience of rapid growth (you learn a lot) and personal self-discovery (you’ll become a master of managing your own psychology). PhD programs (especially if you can make it into a good one) also offer a high density of exceptionally bright people who will become your best friends forever.

Expertise. PhD is probably your only opportunity in life to really drill deep into a topic and become a recognized leading expert in the world at something. You’re exploring the edge of our knowledge as a species, without the burden of lesser distractions or constraints. There’s something beautiful about that and if you disagree, it could be a sign that PhD is not for you.

The disclaimer. I wanted to also add a few words on some of the potential downsides and failure modes. The PhD is a very specific kind of experience that deserves a large disclaimer. You will inevitably find yourself working very hard (especially before paper deadlines). You need to be okay with the suffering and have enough mental stamina and determination to deal with the pressure. At some points you will lose track of what day of the week it is and go on a diet of leftover food from the microkitchens. You’ll sit exhausted and alone in the lab on a beautiful, sunny Saturday scrolling through Facebook pictures of your friends having fun on exotic trips, paid for by their 5-10x larger salaries. You will have to throw away 3 months of your work while somehow keeping your mental health intact. You’ll struggle with the realization that months of your work were spent on a paper with a few citations while your friends do exciting startups with TechCrunch articles or push products to millions of people. You’ll experience identity crises during which you’ll question your life decisions and wonder what you’re doing with some of the best years of your life. As a result, you should be quite certain that you can thrive in an unstructured environment in the pursuit of research and discovery for science. If you’re unsure you should lean slightly negative by default. Ideally you should consider getting a taste of research as an undergraduate on a summer research program before you decide to commit. In fact, one of the primary reasons that research experience is so desirable during the PhD hiring process is not the research itself, but the fact that the student is more likely to know what they’re getting themselves into.

I should clarify explicitly that this post is not about convincing anyone to do a PhD, I’ve merely tried to enumerate some of the common considerations above. The majority of this post focuses on some tips/tricks for navigating the experience once you decide to go for it (which we’ll see shortly, below).

Lastly, as a random thought I heard it said that you should only do a PhD if you want to go into academia. In light of all of the above I’d argue that a PhD has strong intrinsic value - it’s an end by itself, not just a means to some end (e.g. academic job).


Visit his post5 for the full text including:

  • Adviser
  • Research topics
  • Writing papers
  • Writing code
  • Giving talks
  • Attending conferences
  • Closing thoughts

💡 Twenty Things to Remember

Lucy A. Taylor

Recent PhD graduate Lucy A. Taylor shares the advice she and her colleagues wish they had received.6 Lucy Taylor received her PhD from the University of Oxford, UK, in 2018.


  1. Matt Might. The illustrated guide to a Ph.D. [url

  2. Deeplearning.ai. ACM Webinar: Q&A with Andrew Ng on Building a Career in Machine Learning. [url

  3. S. Keshav. How to Read a Paper. [url

  4. Schmook. How do you read math-heavy machine learning papers?. [url

  5. Andrej Karpathy. A Survival Guide to a PhD. [url

  6. Lucy Taylor. Twenty things I wish I’d known when I started my PhD. Nature Careers Community. [url