Jdavis Communicationskills
Jesse Davis
Dept. of Computer Science Katholieke Universiteit Leuven [and Ingo Thon]
Motivation
PhD students
Share ideas by writing papers and giving presentations
Must be able to effectively communicate all your great work
PART I: WRITING
Motivation
The first way your work is introduced to the broader research community is a paper submission
Reviewers have very little time
Easy to reject unclear, poorly written papers
You need publications to graduate
Outline
Fundamentals of writing
Structuring a paper
General advice and common errors
Exercises
Writing Commandments
START WRITING EARLY
Writing should be clear, concise, non-redundant, and consistent
Writing takes a long time
Papers must be (re)written many times
Hard to write, even for native speakers
Promoters are busy and often cannot read papers right away
Be organized!
Does a sentence make sense?
Can you understand it based on its context?
Does it clearly convey your point?
Does it contain all the necessary information?
Write Concisely
Do you need every word in the sentence?
Is there a shorter way to express your point?
Example:
Wordy: Because one customer has bought a pair of books together, every other customer that is interested in one of these books is not necessarily also interested in the other one.
Concise: Just because one customer bought a pair of books does not imply that every other customer that buys one of these books will buy the other book.
Non-Redundant
The paper's main message should be repeated, but that is it
Look at paragraphs/sections and see if the same information appears more than once
Ties back to being concise
BE CONSISTENT
E.g., do not switch between statistical relational learning, probabilistic learning, etc.
E.g., italicize a definition
E.g., different fonts for formulas
Start writing early
Break up the text with figures/lists/different environments
Rewrite liberally: I go through many, many iterations before I'm happy
Show your writing to lots of people
Things clear to you aren't clear to others
People spot errors that you can't
Paper Structure
Introduction
Background
Algorithm
Empirical evaluation
Related work
Conclusions and future work
What is the problem you are solving?
Why is it important and challenging?
What is your solution to the problem?
How did you solve the problem?
How is your solution different?
What is the value added of your approach?
Describe your solution to the problem
Provide enough details to redo your work
Justify the choices you made
Any theoretical guarantees?
Set up the questions you want to answer
What experiments can confirm your hypothesis?
What are the appropriate baseline algorithms?
Note: evaluation can be theoretical too
What is the key difference in your approach?
What are the strengths/weaknesses of your approach vs. prior approaches?
Outline
Fundamentals of writing
Structuring a paper
General advice and common errors [Look at Marie Des Jardins' Web page]
Exercises
General Rules
Avoid contractions
Spell out numbers less than or equal to ten
Avoid transitioning from a section directly to a subsection without intervening text
Avoid a section that has only one subsection
Avoid colloquial English or slang
Bad:
Good:
The following example is used to illustrate
The algorithm is supposed
The full state is generated from a set of variables
Only temporal sequences are considered
Think about word choice; use a dictionary and thesaurus if needed
Example: Exact inference in hybrid models is prohibitively slow so a learning method using exact estimations of the marginal distributions is prohibitive.
Note: This cannot always be avoided
Common Errors
"i.e." means "that is" and must be written "i.e.,"
"e.g." means "for example" and must be written "e.g.,"
Et al.
Footnotes go after periods: .\footnote{}
Punctuation goes inside quotes
Common Errors
Bad: (Davis et al., 2005) shows
Bad: [17] shows
Good: Davis et al. (2005) show
"Between" is used for only two entities
"Among" is used for more than two entities
I ____ myself how to play the piano
I ____ how to play the piano
I ____ about machine learning in class
Luc ____ me about probabilistic models in class
Luc ____ probabilistic models in his lecture
Compound Adjectives
Put hyphens in underlined words if needed
The real world is complex.
We evaluated our algorithm on three real world domains.
The state of the art approach is based on
The first order of business
First order logic
The key points are illustrated in Figure 1. The company is based in Portland and designs microprocessors. The decisions that are selected should maximize the expected utility. Variables are used as placeholders for specific entities. The family of probabilistic programming languages considered in this thesis are based on logic programming.
Traditional machine learning techniques expects data to come in the form of feature vectors. A feature vector describes the training example by a set of features. The second skill is the ability to decide what to do, and is concerned with deciding which actions to take.
Make Parallel
This learning algorithm is closely related to the one for CPT-L. First of all it provides a motivation for the propositional logical formula generated by CPT-L, which can be seen as specialization of Clark's completion. Furthermore, it generalizes EM by means of the BDD algorithm of CPT-L. The generalization allows for hidden and deterministic variables. Third, as in CPT-L it splits sequences into transitions; LFI-ProbLog exploits certain Independence to split training examples. In fact, LFIProbLog, when applied to translated CPT-L, would automatically rediscover its learning algorithm. Finally, very similar to the partial lifted algorithm of CPT-L, LFI-Problog provides a probabilistic version of unit propagation.
Assignment
Write a double-spaced text of at most two pages that
Clearly defines your research problem
Briefly states how you will tackle it
Motivation
Most people do not have time to read every paper
Listening to talks keeps them up-to-date
Great way to convince people that your research is relevant and important
Outline
General advice for presentations
Improving readability
Using pictures
Presenting results
Tips for making slides
Goals of Presenting
Goal is to give the big picture, not all the details
Think about: What is the take-away message?
Think about the structure of the talk
Make sure to introduce all relevant background knowledge/terms
Think about how to best convey ideas
Making good slides takes a long time
Structuring a Presentation
General structure:
Motivation
Background
Algorithm
Results
Conclusions
Structuring a Presentation
As in writing, you want to convey:
What is the problem?
Why is it important?
How did you solve it?
What results did you get?
Be sure to clearly define the problem you are trying to solve
A good strategy is "Given" and "Do"
Another good strategy is to show a picture
Learning Task
[Figure: network with nodes Cancer and Mass Size; a second build adds the node Increase in Size]
Given: Dataset, initial set of features
Do: Automatically invent new features and relations, and learn a statistical model
Prescriptions
PID  Prescribed  Medication  Dose  Duration
P1   5/17/98     prilosec    10mg  3 months
Diseases
PID  Date    Symptoms      Diagnosis
P1   1/1/01  palpitations  hypoglycemic
P1   2/1/03  fever, aches  influenza
Lab Tests
PID  Date    Lab Test       Result
P1   1/1/01  blood glucose  42
P1   1/9/02  blood glucose  45
Goal: Predict at prescription time if a patient will have an adverse reaction to a medicine
Space between points is helpful
It makes things easier to read for a listener
Color highlights important points
Complex and uncertain data Implicitly defined features and relations Train and test data come from different distributions Invent predicates/features: More accurate learned models Abstract predicates: Reuse knowledge across domains
Complex and uncertain data
Implicitly defined features and relations
Train and test data come from different distributions

Invent predicates/features: More accurate learned models
Abstract predicates: Reuse knowledge across domains
First-Order Logic
Constants, variables, predicates, functions
E.g.: Anna, x, Friends(x,y), MotherOf(x)
Literal: Predicate or its negation
Clause, e.g.: ¬Friends(x,y) ∨ ¬Friends(y,z) ∨ Friends(x,z)
Definite clause, e.g.: Friends(x,y) ∧ Friends(y,z) ⇒ Friends(x,z)
Dark on dark
Light on light
Think about people who are color-blind
Try to use at least a 24-point font
Think about people in the back of the room
Make sure the top, bottom, and sides are not cut off
Propositional methods treat each example independently
Predict the label for each example using only the attribute values for that example
[Figure, shown in several builds: a table of four examples (IDs 1 to 4, patients P1, P1, P1, P2) with unknown labels "?"; the label of the first example is filled in as "No" while the others remain "?"]
Bayesian Networks
[Pearl, 1988]
Each node x has a conditional probability distribution: Prob(x | Parents(x))
Encodes the following distribution
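The distribution itself did not survive in the slide text. It is presumably the standard Bayesian network factorization, in which the joint distribution is the product of each node's conditional distribution given its parents. The variable names $x_1, \dots, x_n$ are notation assumed here, not taken from the slides:

```latex
P(x_1, \dots, x_n) = \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{Parents}(x_i)\bigr)
```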
Bayesian Networks
[Pearl, 1988]
Each variable is a node
Arcs capture dependences
Conditional probability distribution: P(Present | No) = .01, P(Present | Yes) = .55
[Figure: network over the nodes Cancer and Mass Size, shown with its joint distribution]
Given:
Learn: A set of clauses that when combined with the background knowledge
SAYU-VISTA Algorithm
Feature construction
Model induction
Iterative procedure:
Propose a new feature as a first-order definite clause
Introduce the feature as a binary variable in the statistical model
If including the feature improves the score, keep it
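The iterative procedure can be pictured as a greedy loop: try one candidate feature at a time and keep it only if the model's score improves. The following is a minimal sketch of that generic idea, not the actual SAYU-VISTA implementation. The function name, the candidate names, and the toy scoring function are all hypothetical stand-ins for clause search and model evaluation.

```python
from typing import Callable, Iterable, List


def greedy_feature_selection(
    candidates: Iterable[str],
    score: Callable[[List[str]], float],
    initial: List[str],
) -> List[str]:
    """Keep each candidate feature only if adding it improves the score."""
    selected = list(initial)
    best = score(selected)
    for feature in candidates:
        trial = selected + [feature]
        trial_score = score(trial)
        if trial_score > best:  # the feature helps, so keep it
            selected, best = trial, trial_score
    return selected


# Hypothetical toy score: reward "useful" features, small penalty per feature.
USEFUL = {"feat_1", "pred_1", "pred_3"}


def toy_score(features: List[str]) -> float:
    return sum(1.0 for f in features if f in USEFUL) - 0.1 * len(features)


result = greedy_feature_selection(
    ["pred_1", "pred_2", "pred_3"], toy_score, ["feat_1"]
)
print(result)
```

In this toy run, pred_2 is tried and discarded because it lowers the score, while pred_1 and pred_3 are kept.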
[Figure sequence for the SAYU-VISTA Algorithm, shown in several builds: a model with nodes Class Value and Feat 1 ... Feat N, extended step by step with candidate predicates Pred 1, Pred 2, and Pred 3 (labels shown include P2(patient) and HistoryOfBC(patient)) drawn from the Background Knowledge; a score of 0.02 is shown on the first build]
Presenting Results
What is the point of the experiments?
What datasets were used?
What are the relevant metrics?
What are the relevant baselines?
Presenting Results
[Figure: precision-recall plot; axes: Recall, Precision]
Label axes
Describe the axes and what the plot shows
Describe what good/best results look like
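When describing a precision-recall plot, it can help to be able to say exactly how each point is computed. A minimal sketch, assuming binary labels (1 = positive), at least one positive example, and no tied scores. The function name and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def precision_recall_points(
    scores: Sequence[float], labels: Sequence[int]
) -> List[Tuple[float, float]]:
    """Return one (recall, precision) pair per cutoff, highest score first."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)  # assumed to be at least 1
    points = []
    true_positives = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        precision = true_positives / rank  # fraction of predictions that are correct
        recall = true_positives / total_positives  # fraction of positives found
        points.append((recall, precision))
    return points


points = precision_recall_points([0.9, 0.8, 0.6, 0.3], [1, 0, 1, 0])
for recall, precision in points:
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

With recall on the x-axis and precision on the y-axis, the best possible result sits in the top-right corner: every positive found (recall 1.0) with no false alarms (precision 1.0).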
Presenting results
This is pointless
Yada yada
Good
Proposes new feature as first-order definite clauses
Introduce feature as binary variable in statistical model
Assignment!
Prepare a 10-minute presentation
Teach us something
It can be anything that is interesting to you
Goal is to start practicing good presenting practices
Presentation Wrap Up
Avoid the following pitfalls:
Lacking a take-away message
Not practicing the presentation enough
Including too much text on the slides
Allocating insufficient time for making slides
Writing Speaking