CIS 6614 meeting -*- Outline -*-

* Discussion about Reports

     Your report/paper is part of a two-thousand-year-long conversation
     aiming to improve our culture and make the world a better place

     Q: What is science? What is a scientific claim?

        Science is about finding how the world works, objectively

        A scientific claim is one that is falsifiable:
        it can be shown to be either true or false based on evidence,
        not belief, assertions, authority...

        Example: what causes stomach ulcers?
           for a long time doctors thought it was diet,
              e.g., spicy foods, as those may make symptoms worse
           but it turns out to be caused by a bacterium (Helicobacter pylori)
           or by taking too much anti-inflammatory medicine (NSAIDs),
              such as ibuprofen and aspirin

** judging tools and papers

*** industry

------------------------------------------
 HOW TOOLS/PAPERS ARE JUDGED IN INDUSTRY

What do managers/developers look for?

------------------------------------------
        ... - does it solve my problem?
            - evidence of effectiveness
            - availability
            - cost (money, time, expertise)

*** academia

------------------------------------------
   HOW PAPERS ARE JUDGED IN ACADEMIA

Readers/reviewers look for:

------------------------------------------
        ... - importance of the problem (esp. in CS, less so in math)
            - validity and evidence of validity
            - generality
            - novelty (new ideas, approaches)

     Q: What does "generality" mean in Computing?

        That the solution applies to many cases:
           algorithms, approaches, techniques

     Q: Why is generality important?

** abstract

------------------------------------------
          WRITING AN ABSTRACT

Why?

Model 1: Summary of paper
   - Context
   - Problem
   - Approach
   - Claims and Results
   - Benefits

Model 2 (4 sentences by Kent Beck):

------------------------------------------
        ... it's one of the most read parts of a paper
            can get people to read the rest
            (and then hope readers will cite paper or purchase tool)

        ... 1. Problem
            2. Why it's a problem (importance)
            3. Startling sentence
            4. Implication of startling sentence

        For the startling sentence, Beck recommends that you
        "find the most amazing thing about the work and write it down".
        It should grab the reader's attention.
        It should be something that is falsifiable,
        that is, it could be shown to be either true or false

        For Model 2, see Kent Beck's part (p. 434-435, esp. p. 435) in:
        Ralph E. Johnson, Kent Beck, Grady Booch, William Cook,
        Richard Gabriel, and Rebecca Wirfs-Brock.
        How to get a paper accepted at OOPSLA (panel).
        SIGPLAN Not. vol. 28, 10 (Oct. 1, 1993), 429--436.
        https://doi.org/10.1145/167962.165934

** attack models

------------------------------------------
       WHAT IS AN ATTACK MODEL?

Describe:
   - Assumed capabilities of attacker
   - What attacker can do, during an attack

NOT:
   - How attack proceeds
   - What attacker wants to do after attack
------------------------------------------

     Q: Why do we want to assume attackers have certain capabilities?

        So that we can still give some guarantees
        even if attacker can do those things;
        it's a way of making a conservative assumption

     Q: What is the name for what an attacker wants to do
        after an attack succeeds?

        A threat
        So threat modeling notes what attackers want to do

------------------------------------------
      WHY SPECIFY AN ATTACK MODEL?

So reader can judge:

------------------------------------------
        ... - if model reflects reality
            - if approach works
            - (if proof is valid, if applicable)

** describing the approach

** related work

------------------------------------------
         FINDING RELATED WORK

Resources:
   scholar.google.com
   portal.acm.org
   ieeexplore.ieee.org
   springerlink.com
   usenix.org/publications/
   library.ucf.edu
------------------------------------------

     Q: The resources are good for finding academic works,
        but how would you find commercial products?

        just use a web search (google.com)

     For judging whether to read a paper:
        - Is it cited a lot?
        - Does the title sound like it solves our problem? (prob. yes)
        - Read the abstract, is it solving our problem?
        - Read the introduction and conclusion:
            - Is it solving our problem?
            - Does it provide evidence for its claims?
            - Does it have advantages vs. what we can claim?
            - Are there interesting ideas? (usually in approach)
        - If it's very interesting, read the whole paper in detail

------------------------------------------
    WHAT TO LOOK FOR IN RELATED WORKS

What problem is being solved?
   Is it the same problem?
   What are the differences?
   Is it available for use?

What claims are made for the solution?
   Is that better than our solution?
   What are the differences?
   What are the pros and cons?

What approach is used in their solution?
   Are there any good ideas we can use?
   How hard was it to implement?
------------------------------------------

** evaluation

------------------------------------------
     EVALUATION: DO MORE THAN ONE!

If an engineer says that
a tool works in one case,
would you use it in your company?

------------------------------------------
        ... no, so experimental evidence needs
            more than one case evaluation

        a single result is a case study, not an experiment

        case study: useful for explanations but not for evidence
        experiments: could be convincing to change behavior
           e.g., to get a company to spend money
           (may not be very helpful for explanation)

        ... for evaluation need:
            - many cases
            - statistics

        ... for comparison of tools need:
            - to compare many tools
            - to compare on many different cases
            - statistics

** example

     We will work an example of using static analysis,
     to prevent integer overflows

*** topic

------------------------------------------
            EXAMPLE TOPIC

Problem: preventing integer overflows

Wrap-around can cause:
   - allocation of 0 bytes
   - logic errors

Complications:
   - implicit coercions
       e.g., unsigned int to int
   - sign extension
   - pointer arithmetic uses different
     types in C (size_t vs. ptrdiff_t)

4 CWEs:
   682: incorrect calculation
   190: integer overflow or wraparound
   191: integer underflow
   192: integer coercion error
------------------------------------------

     Note that pointer sizes can change on different systems!

*** Approach

------------------------------------------
        WHAT APPROACH TO USE?

Basic decision: static vs. dynamic

What are the pros and cons?

So what should be the plan?

------------------------------------------
        ... static analysis:
            - needs to be conservative to be sound
              ==> has false positives
            - can lead to guarantees
            - TCB (trusted computing base) includes the analysis

            dynamic analysis:
            - cannot give absolute guarantees (incomplete)
            - can have no false positives
            - TCB is the compiler and tests

     Q: Which will have a smaller trusted computing base?

        probably dynamic analysis

        And what is the problem?
           people getting the code wrong
        So this would favor the dynamic approach

        ... use dynamic checking to prevent integer wraparound.

*** title

     This is important as it will be the most read part of your paper
     and gives the first impression to a (potential) reader

     Often used in searches

------------------------------------------
                TITLE

Ideas:
   - mention key words
   - describe problem ("Preventing, ...")
   - include any limiting context
   - describe kind of solution
   - fit in claim if possible
   - use a colon for a subtitle

Example:

------------------------------------------
        Preventing Integer Overflows Using Dynamic Checking in Java

*** Related Work

     Important to find this early in academic work,
        to make sure the effort will pay off

     Important to find tools for solving the problem in industry

**** related tools (industry)

------------------------------------------
  SEARCHING FOR INDUSTRIAL RELATED WORK

Tips:
   - search for the problem on google
   - look into compilers
     (for related languages)
------------------------------------------

     try searching for "integer overflow security bugs" on google.com

     Key related work for the example is SafeInt by LeBlanc
     and -ftrapv in gcc (from the book 24 sins...)
     (these solve the same problem for a different language)

**** related papers (academia)

------------------------------------------
   SEARCHING FOR ACADEMIC RELATED WORK

Tips:
   - search for the problem in
     scholarly engines
   - search in multiple places
------------------------------------------

     try searching for "integer overflow security bugs"
     in scholar.google.com
     and in UCF Library, ACM DL, IEEE Xplore, Springerlink.com

     In scholar, look at what is cited by interesting papers

     Save the papers somewhere...

     Q: Should we ignore related work that uses a different approach?

        No! It's related if it solves the same problem!
        But can lump works together by approach in a discussion

     Key related work for the example is SafeInt by LeBlanc
     and -ftrapv in gcc

*** Describing the Approach (for tool builders)

------------------------------------------
        DESCRIBING THE APPROACH

Key questions:
   - what would a CS grad student
     need to know to work on this?

Describe:
   - overall approach in technical terms
       e.g., static or dynamic analysis
   - key decisions
   - user interface or API
   - modules/components of software
       use of existing tools
   - architecture, how components connect
   - key data structures and algorithms
------------------------------------------

------------------------------------------
               EXAMPLE

------------------------------------------
        ... Decisions:
            - overall approach: dynamic analysis
            - Class for safe integer math;
              its methods all throw a checked exception
              whenever there is wraparound
            - Need one class for each type of Java integer:
                 short, int, long
              Would it be better to pass in signed/unsigned and limit(s)?
            - static (class) or instance (object) methods?
              will do both
            - data structures and math not that interesting in this case

*** Evaluation

------------------------------------------
              EVALUATION

For tool comparisons:
   case studies showing:
      - download (availability)
      - install difficulty
      - ease of use (for extremes)
   experiments showing:
      - effectiveness (for real problems)
      - amount of imprecision
      - cost

For tools:
   experiments showing:
      - effectiveness
      - amount of imprecision
        (false positives, false negatives)
      - (performance)
   case studies:
      - showing utility
      - helping explain ideas
------------------------------------------

     For theory work, would have theorems and proofs

------------------------------------------
         DESIGNING EXPERIMENTS

Planning:
   - What are the possible outcomes?
   - What will those tell us?

Need:

------------------------------------------
        ... - variety of experiments (not just one)
                e.g., addition, subtraction, multiplication
              ideally inspired by problem and importance
                e.g., allocation of array
            - both positive and negative tests

*** Related Work

------------------------------------------
             RELATED WORK

What solves the same problem?
   Lump any of it together?

What advantages/disadvantages vs. related?
------------------------------------------

     discuss the related work found earlier
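
     A possible sketch of the example's approach decisions, for int only:
     a class for safe integer math whose static methods throw a checked
     exception whenever there is wraparound, plus one positive and one
     negative test inspired by array allocation.
     The names SafeInt (borrowed from the related work, but not LeBlanc's
     API) and WraparoundException are hypothetical, chosen for illustration;
     instance methods and the short and long variants are left out.
     The sketch assumes Java 8 or later, since it relies on
     Math.addExact, Math.subtractExact, and Math.multiplyExact.

```java
/** Hypothetical class of checked arithmetic for Java's int type. */
final class SafeInt {

    /** Hypothetical checked exception that signals integer wraparound. */
    static class WraparoundException extends Exception {
        WraparoundException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private SafeInt() { }  // this sketch has static methods only

    /** Returns a + b, or throws if the sum does not fit in an int. */
    static int add(int a, int b) throws WraparoundException {
        try {
            return Math.addExact(a, b);
        } catch (ArithmeticException e) {
            throw new WraparoundException(a + " + " + b + " wraps around", e);
        }
    }

    /** Returns a - b, or throws if the difference does not fit in an int. */
    static int subtract(int a, int b) throws WraparoundException {
        try {
            return Math.subtractExact(a, b);
        } catch (ArithmeticException e) {
            throw new WraparoundException(a + " - " + b + " wraps around", e);
        }
    }

    /** Returns a * b, or throws if the product does not fit in an int. */
    static int multiply(int a, int b) throws WraparoundException {
        try {
            return Math.multiplyExact(a, b);
        } catch (ArithmeticException e) {
            throw new WraparoundException(a + " * " + b + " wraps around", e);
        }
    }

    /** Small driver with one positive and one negative test. */
    public static void main(String[] args) {
        try {
            // positive test: the size fits, so the allocation proceeds
            int[] buffer = new int[multiply(1000, 4)];
            System.out.println("allocated " + buffer.length + " ints");
            // negative test: (1 << 30) * 4 would silently wrap around to 0
            multiply(1 << 30, 4);
            System.out.println("wraparound was missed");
        } catch (WraparoundException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

     Making the exception checked means the compiler forces every caller
     to handle wraparound, which fits the problem noted above
     (people getting the code wrong); the cost is more try/catch code.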