CIS 6614 meeting -*- Outline -*-

* Discussion about Reports
    your report/paper is part of a
       two-thousand-year-long conversation
       aiming to improve our culture and make the world a better place

    Q: What is science? What is a scientific claim?
       Science is about finding how the world works, objectively
       A scientific claim is one that is falsifiable,
          can be shown to be either true or false
          based on evidence, not belief, assertions, authority...

       Example: what causes stomach ulcers?
          for a long time doctors thought it was diet,
              e.g., spicy foods, as those may make symptoms worse
          but it turns out to be caused by a bacterium (Helicobacter pylori)
            or by taking too many anti-inflammatory medicines
              (NSAIDs), such as ibuprofen and aspirin

** judging tools and papers
*** industry
------------------------------------------
 HOW TOOLS/PAPERS ARE JUDGED IN INDUSTRY

What do managers/developers look for?


------------------------------------------
     ... - does it solve my problem?
         - evidence of effectiveness
         - availability
         - cost (money, time, expertise)


*** academia
------------------------------------------
     HOW PAPERS ARE JUDGED IN ACADEMIA

Readers/reviewers look for:


------------------------------------------
      ... - importance of the problem
             (esp. in CS, less so in math)
          - validity and evidence of validity
          - generality
          - novelty (new ideas, approaches)

      Q: What does "generality" mean in Computing?
         That the solution applies to many cases
            algorithms, approaches, techniques

      Q: Why is generality important?
      
** abstract
------------------------------------------
         WRITING AN ABSTRACT

Why?


Model 1:
    Summary of paper
      - Context
      - Problem
      - Approach
      - Claims and Results
      - Benefits

Model 2 (4 sentences by Kent Beck):


------------------------------------------
        ... it's one of the most read parts of a paper
            can get people to read the rest
                (and then hope readers will cite paper or purchase tool)


        ... 1. Problem
            2. Why it's a problem (importance)
            3. Startling sentence
            4. Implication of startling sentence

        For the startling sentence, Beck recommends that you
        "find the most amazing thing about the work and write it down".
        It should grab the reader's attention.
        It should be something that is falsifiable,
          that is, it could be shown to be either true or false

        For Model 2, see Kent Beck's part (pp. 434-435, esp. p. 435) in:
        Ralph E. Johnson, Kent Beck, Grady Booch, William Cook,
        Richard Gabriel, and Rebecca Wirfs-Brock.
        How to get a paper accepted at OOPSLA (panel).
        SIGPLAN Not. vol. 28, 10 (Oct. 1, 1993), 429--436.
        https://doi.org/10.1145/167962.165934


** attack models
------------------------------------------
        WHAT IS AN ATTACK MODEL?

Describe:
 - Assumed capabilities of attacker
 - What attacker can do during an attack


NOT:
 - How attack proceeds
 - What attacker wants to do after attack


------------------------------------------
        Q: Why do we want to assume attackers have certain capabilities?
           So that we can still give some guarantees
              even if attacker can do those things,
              it's a way of making a conservative assumption

        Q: What is the name for what an attacker wants to do
            after an attack succeeds?
            A threat
            So threat modeling notes what attackers want to do

------------------------------------------
     WHY SPECIFY AN ATTACK MODEL?


So reader can judge:


------------------------------------------
        ... - if model reflects reality
            - if approach works
            - (if proof is valid, if applicable)

** describing the approach

** related work
------------------------------------------
         FINDING RELATED WORK

Resources:
    scholar.google.com
    portal.acm.org
    ieeexplore.ieee.org
    springerlink.com
    usenix.org/publications/
    library.ucf.edu


------------------------------------------
        Q: The resources are good for finding academic works,
           but how would you find commercial products?
           just use a web search (google.com)

        For judging whether to read a paper:
            - Is it cited a lot?
            - Does the title sound like it solves our problem? (prob. yes)
            - Read the abstract, is it solving our problem?
            - Read the introduction and conclusion:
                  - Is it solving our problem?
                  - Does it provide evidence for its claims?
                  - Does it have advantages vs. what we can claim?
                  - Are there interesting ideas? (usually in approach)
            - If it's very interesting, read the whole paper in detail

------------------------------------------
    WHAT TO LOOK FOR IN RELATED WORKS

What problem is being solved?

   Is it the same problem?
      What are the differences?

Is it available for use?

What claims are made for the solution?

   Is that better than our solution?
      What are the differences?
      What are the pros and cons?

What approach is used in their solution?

  Are there any good ideas we can use?
  How hard was it to implement?

------------------------------------------


** evaluation
------------------------------------------
        EVALUATION: DO MORE THAN ONE!

If an engineer says that a tool works in
   one case,
   would you use it in your company?


------------------------------------------
        ... no, so experimental evidence needs more than one case
            evaluation

            a single result is a case study, not an experiment

            case study: useful for explanations
                        but not for evidence

            experiments: could be convincing to change behavior
                             e.g., to get a company to spend money
                         (may not be very helpful for explanation)

         ... for evaluation need:
               - many cases
               - statistics

         ... for comparison of tools need:

               - to compare many tools
               - to compare on many different cases
               - statistics

** example
      We will work an example of using static analysis,
         to prevent integer overflows

*** topic
------------------------------------------
          EXAMPLE TOPIC

Problem: preventing integer overflows

Wrap-around can cause:
 - allocation of 0 bytes
 - logic errors

Complications:
 - implicit coercions
     e.g., unsigned int to int
 - sign extension
 - pointer arithmetic uses different types
    in C (size_t vs. ptrdiff_t)

4 CWEs:
  682: incorrect calculation
  190: integer overflow or wraparound
  191: integer underflow
  192: integer coercion error
   
------------------------------------------

        Note that pointer sizes can change on different systems!
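
        The wraparound, coercion, and sign-extension issues on the slide
        can also be seen in Java, the language used for the example tool
        later in these notes. The following is only an illustrative
        sketch; the class and method names are made up for this example.

```java
// Illustrative sketch; the class and method names are hypothetical.
final class WraparoundDemo {
    // Computes a buffer size the unsafe way: int multiplication wraps silently.
    static int unsafeSize(int count, int elemSize) {
        return count * elemSize;
    }

    // Narrowing coercion: only the low 32 bits of the long are kept.
    static int narrow(long value) {
        return (int) value;
    }

    // Widening a byte to an int sign-extends it.
    static int widen(byte b) {
        return b;
    }

    public static void main(String[] args) {
        System.out.println(unsafeSize(1 << 30, 4));   // 2^30 * 4 = 2^32 wraps to 0
        System.out.println(narrow(1L << 32));         // 0
        System.out.println(widen((byte) 0xFF));       // -1
        System.out.println(((byte) 0xFF) & 0xFF);     // 255 (masking undoes the sign extension)
    }
}
```

        The first call shows how a size computation can wrap around to
        0, which is the "allocation of 0 bytes" case from the slide.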

*** Approach
------------------------------------------
          WHAT APPROACH TO USE?

Basic decision: static vs. dynamic

What are the pros and cons?


So what should be the plan?

------------------------------------------

        ... static analysis:
             - needs to be conservative to be sound
                   ==> has false positives
             - can lead to guarantees
             - TCB includes the analysis

            dynamic analysis
             - cannot give absolute guarantees (incomplete)
             - can have no false positives
             - TCB is the compiler and tests

            Q: Which will have a smaller trusted computing base?
               probably dynamic analysis

               And what is the problem?
                 people getting the code wrong

               So this would favor the dynamic approach

           ... use dynamic checking to prevent integer wraparound.
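
        As a rough illustration of what dynamic checking means here,
        Java's standard library already has Math.multiplyExact (and
        addExact, subtractExact), which throw an unchecked
        ArithmeticException on int overflow. The sketch below uses them
        only as a stand-in for the idea; the class and helper names are
        hypothetical and this is not the design chosen later.

```java
// Illustrative sketch; DynamicCheckDemo and its methods are hypothetical names.
final class DynamicCheckDemo {
    // Returns count * elemSize, or throws ArithmeticException if the
    // product does not fit in an int.
    static int checkedSize(int count, int elemSize) {
        return Math.multiplyExact(count, elemSize);
    }

    // Reports whether the run-time check fired for these operands.
    static boolean overflows(int count, int elemSize) {
        try {
            checkedSize(count, elemSize);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(checkedSize(1000, 4));     // 4000
        System.out.println(overflows(1 << 30, 4));    // true
    }
}
```

        The check only fires on executions that actually overflow, so
        it has no false positives but gives no guarantee about inputs
        that were never run.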

*** title
       This is important as it will be the most read part
          of your paper
          and gives the first impression to a (potential) reader

       Often used in searches
       
------------------------------------------
              TITLE

Ideas:

 - mention key words
 - describe problem ("Preventing ...")
 - include any limiting context
 - describe kind of solution
 - fit in claim if possible
 - use a colon for a subtitle

Example:


------------------------------------------

        Preventing Integer Overflows Using Dynamic Checking in Java

*** Related Work
      Important to find this early in academic work,
         to make sure the effort will pay off
      Important to find tools for solving the problem in industry

**** related tools (industry)
------------------------------------------
  SEARCHING FOR INDUSTRIAL RELATED WORK

Tips:

 - search for the problem on google
 - look into compilers
     (for related languages)
     

------------------------------------------
       try searching for "integer overflow security bugs" on google.com
       
       Key related work for the example is SafeInt by LeBlanc
           and -ftrapv in gcc  (from the book 24 sins...)
           (these solve the same problem for a different language)

**** related papers (academia)
------------------------------------------
    SEARCHING FOR ACADEMIC RELATED WORK


Tips:

 - search for the problem
     in scholarly engines
 - search in multiple places


------------------------------------------

        try searching for "integer overflow security bugs"
           in scholar.google.com
           and in UCF Library, ACM DL, IEEE Xplore, Springerlink.com

           In scholar, look at what is cited by interesting papers

           Save the papers somewhere...

       Q: Should we ignore related work that uses a different approach?
          No!  It's related if it solves the same problem!
               But can lump works together by approach in a discussion

       Key related work for the example is SafeInt by LeBlanc
           and -ftrapv in gcc

*** Describing the Approach (for tool builders)
------------------------------------------
      DESCRIBING THE APPROACH

Key questions:
  - what would a CS grad student
    need to know to work on this?


Describe:
   - overall approach in technical terms
       e.g., static or dynamic analysis

   - key decisions

   - user interface or API

   - modules/components of software
       use of existing tools

   - architecture, how components connect

   - key data structures and algorithms
   
------------------------------------------

------------------------------------------
            EXAMPLE


------------------------------------------

        ...
         Decisions:
            - overall approach: dynamic analysis
            
            - Class for safe integer math,
               all of whose methods
               throw a checked exception whenever there is wraparound

            - Need one class for each type of Java integer:
                 short, int, long

              Would it be better to pass in signed/unsigned and limit(s)?

            - static (class) or instance (object) methods?
                will do both

            - data structures and math not that interesting in this
              case
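
        The decisions above might be sketched as follows for the int
        case. This is a minimal sketch under assumptions, not the
        actual tool: the names WraparoundException and SafeInteger and
        all of the method names are hypothetical, and only addition and
        multiplication are shown. Similar classes would be needed for
        short and long.

```java
// Hypothetical checked exception that signals integer wraparound.
class WraparoundException extends Exception {
    WraparoundException(String message) {
        super(message);
    }
}

// Hypothetical safe wrapper for Java's int.
final class SafeInteger {
    private final int value;

    SafeInteger(int value) {
        this.value = value;
    }

    int intValue() {
        return value;
    }

    // Narrows a 64-bit result back to int, or reports wraparound.
    private static int narrow(long result, String what) throws WraparoundException {
        if (result < Integer.MIN_VALUE || result > Integer.MAX_VALUE) {
            throw new WraparoundException(what + " wraps around");
        }
        return (int) result;
    }

    // Static (class) methods: compute in long, where int operands cannot overflow.
    static int add(int a, int b) throws WraparoundException {
        return narrow((long) a + b, a + " + " + b);
    }

    static int multiply(int a, int b) throws WraparoundException {
        return narrow((long) a * b, a + " * " + b);
    }

    // Instance (object) methods delegate to the static ones.
    SafeInteger plus(SafeInteger other) throws WraparoundException {
        return new SafeInteger(add(value, other.value));
    }

    SafeInteger times(SafeInteger other) throws WraparoundException {
        return new SafeInteger(multiply(value, other.value));
    }

    public static void main(String[] args) {
        try {
            System.out.println(add(2, 3));                                               // 5
            System.out.println(new SafeInteger(6).times(new SafeInteger(7)).intValue()); // 42
            multiply(1 << 30, 4);                                                        // throws
        } catch (WraparoundException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

        Because the exception is checked, the compiler forces every
        caller to handle or declare the wraparound case. Widening to
        long works for short and int, but a class for long would need a
        different check.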

*** Evaluation
------------------------------------------
             EVALUATION

For tool comparisons:
  case studies showing:
    - download (availability)
    - install difficulty
    - ease of use (for extremes)

  experiments showing:
    - effectiveness (for real problems)
    - amount of imprecision
    - cost

For tools:
  experiments showing:
    - effectiveness
    - amount of imprecision
      (false positives, false negatives)
    - (performance)

  case studies:
    - showing utility
    - helping explain ideas

------------------------------------------

    For theory work, would have theorems and proofs
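
    The "amount of imprecision" listed for tool experiments is often
    reported as false positive and false negative rates. The sketch
    below shows one common way to compute them; the class name is
    hypothetical and the counts in main are made-up placeholders, not
    measurements of any tool.

```java
// Illustrative sketch; ImprecisionRates is a hypothetical name.
final class ImprecisionRates {
    // Fraction of safe cases that were wrongly flagged: FP / (FP + TN).
    static double falsePositiveRate(int falsePositives, int trueNegatives) {
        return (double) falsePositives / (falsePositives + trueNegatives);
    }

    // Fraction of real problems that were missed: FN / (FN + TP).
    static double falseNegativeRate(int falseNegatives, int truePositives) {
        return (double) falseNegatives / (falseNegatives + truePositives);
    }

    public static void main(String[] args) {
        // Placeholder counts for illustration only.
        System.out.println(falsePositiveRate(5, 95));   // 5 of 100 safe cases flagged
        System.out.println(falseNegativeRate(2, 18));   // 2 of 20 real problems missed
    }
}
```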

------------------------------------------
          DESIGNING EXPERIMENTS

Planning:
 - What are the possible outcomes?
 - What will those tell us?

Need:


------------------------------------------
      ... - variety of experiments
             (not just one)
             e.g., addition, subtraction, multiplication
             ideally inspired by problem and importance
                 e.g., allocation of array

          - both positive and negative tests
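
          A table of experiments along these lines might look like the
          sketch below. It assumes the tool under test can be called as
          an int operation that throws on wraparound; Java's
          Math.addExact, subtractExact, and multiplyExact stand in for
          it here. The class, the Case type, and the chosen operands
          are hypothetical.

```java
import java.util.function.IntBinaryOperator;

// Illustrative sketch; OverflowExperiments and Case are hypothetical names.
final class OverflowExperiments {
    // One experiment: an operation, its operands, and the expected outcome.
    static final class Case {
        final String name;
        final IntBinaryOperator op;
        final int a;
        final int b;
        final boolean expectOverflow;

        Case(String name, IntBinaryOperator op, int a, int b, boolean expectOverflow) {
            this.name = name;
            this.op = op;
            this.a = a;
            this.b = b;
            this.expectOverflow = expectOverflow;
        }
    }

    // Runs one case and reports whether the check fired.
    static boolean detectsOverflow(Case c) {
        try {
            c.op.applyAsInt(c.a, c.b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // In-range cases must not be flagged; wrapping cases must be flagged.
    static Case[] cases() {
        return new Case[] {
            new Case("add, in range", Math::addExact, 1, 2, false),
            new Case("add, overflow", Math::addExact, Integer.MAX_VALUE, 1, true),
            new Case("subtract, in range", Math::subtractExact, 5, 7, false),
            new Case("subtract, overflow", Math::subtractExact, Integer.MIN_VALUE, 1, true),
            new Case("array size, in range", Math::multiplyExact, 1000, 4, false),
            new Case("array size, overflow", Math::multiplyExact, 1 << 30, 4, true),
        };
    }

    // Counts the cases whose outcome differs from the expectation.
    static int failures() {
        int failed = 0;
        for (Case c : cases()) {
            if (detectsOverflow(c) != c.expectOverflow) {
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        for (Case c : cases()) {
            String verdict = (detectsOverflow(c) == c.expectOverflow) ? "ok" : "FAILED";
            System.out.println(c.name + ": " + verdict);
        }
    }
}
```

          The array-size cases tie the experiments back to the
          motivating problem (allocation), and each operation appears
          with both an in-range and a wrapping input.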
        

*** Related Work
------------------------------------------
          RELATED WORK

What solves the same problem?


Lump any of it together?


What advantages/disadvantages vs. related?


------------------------------------------

    discuss the related work found earlier