Matt Coles 8 years ago
commit
4f76cd056e
5 changed files with 123 additions and 0 deletions
  1. 21 0
      compwocont.md
  2. 27 0
      energysched.md
  3. 19 0
      funcmemo.md
  4. 27 0
      manmem.md
  5. 29 0
      webasm.md

+ 21 - 0
compwocont.md

@@ -0,0 +1,21 @@
1
+# Summarise Contribution & Motivation
2
+
3
+- Both CPS and ANF inhibit some optimisations
4
+- Aim to get benefits of both
5
+- Extend direct-style lambda calculus by adding join points
6
+- How to infer join points
7
+- Join points can be recursive, which enables new optimisations
8
+- Show that approach works in GHC
9
+
10
+# Methodology
11
+
12
+- Define new intermediate language Fj
13
+- Perform single pass to find let bindings that are join points
14
+- These are let-bound functions that will never be captured in a closure
15
+- This is simpler than contification because they only look for tail calls
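
The single pass described above can be sketched roughly as follows. This is a toy model, not the paper's Fj or GHC's implementation: the tuple-based expression encoding and the helper names `only_tail_calls` and `find_join_points` are assumptions made for illustration. A let-bound function counts as a join point here if every use in the let body is a saturated call in tail position, so it can never be captured in a closure.

```python
# Toy expression forms (hypothetical encoding, not Fj):
#   ("lit", v)  ("var", name)  ("lam", [params], body)
#   ("app", fn, [args])  ("if", cond, then, else)  ("let", name, rhs, body)

def only_tail_calls(name, arity, expr, tail=True):
    """True if every occurrence of `name` in `expr` is a saturated tail call."""
    kind = expr[0]
    if kind == "lit":
        return True
    if kind == "var":
        # A bare reference (not the head of a call) could escape into a closure.
        return expr[1] != name
    if kind == "lam":
        _, params, body = expr
        if name in params:  # shadowed by a parameter
            return True
        # Nothing inside a lambda is in tail position of the enclosing let.
        return only_tail_calls(name, arity, body, tail=False)
    if kind == "app":
        _, fn, args = expr
        args_ok = all(only_tail_calls(name, arity, a, tail=False) for a in args)
        if fn == ("var", name):
            return tail and len(args) == arity and args_ok
        return args_ok and only_tail_calls(name, arity, fn, tail=False)
    if kind == "if":
        _, cond, then, other = expr
        return (only_tail_calls(name, arity, cond, tail=False)
                and only_tail_calls(name, arity, then, tail)
                and only_tail_calls(name, arity, other, tail))
    if kind == "let":
        _, bound, rhs, body = expr
        # Simplification: the right-hand side is treated as non-tail.
        rhs_ok = only_tail_calls(name, arity, rhs, tail=False)
        if bound == name:  # shadowed in the body
            return rhs_ok
        return rhs_ok and only_tail_calls(name, arity, body, tail)
    raise ValueError(f"unknown expression form: {kind}")


def find_join_points(expr, found=None):
    """Collect names of let-bound lambdas that are only ever tail-called."""
    found = [] if found is None else found
    kind = expr[0]
    if kind == "let":
        _, name, rhs, body = expr
        if rhs[0] == "lam" and only_tail_calls(name, len(rhs[1]), body):
            found.append(name)
        find_join_points(rhs, found)
        find_join_points(body, found)
    elif kind == "lam":
        find_join_points(expr[2], found)
    elif kind == "app":
        find_join_points(expr[1], found)
        for arg in expr[2]:
            find_join_points(arg, found)
    elif kind == "if":
        for sub in expr[1:]:
            find_join_points(sub, found)
    return found


# `j` is only tail-called from both branches, so it qualifies.
shared_tail = ("let", "j", ("lam", ["x"], ("app", ("var", "g"), [("var", "x")])),
               ("if", ("var", "c"),
                ("app", ("var", "j"), [("lit", 1)]),
                ("app", ("var", "j"), [("lit", 2)])))

# `k` is passed as an argument, so it may be captured and does not qualify.
escaping = ("let", "k", ("lam", ["x"], ("var", "x")),
            ("app", ("var", "map"), [("var", "k"), ("var", "xs")]))
```

Calling `find_join_points(shared_tail)` reports `j`, while `find_join_points(escaping)` reports nothing because `k` escapes as an argument.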
16
+
17
+# Critical Assessment
18
+
19
+- Achieves big reduction in allocations for some programs
20
+- Other programs lose out on potential optimisations
21
+- Transformations can be applied broadly, even to call by value languages

+ 27 - 0
energysched.md

@@ -0,0 +1,27 @@
1
+# Summarise Contribution & Motivation
2
+
3
+- Energy efficiency is a primary design goal for all systems
4
+- Power management not exposed to end user
5
+- This makes it difficult to design general purpose runtime to schedule work between the CPU and GPU
6
+- Black box approach is used where power model is computed once for each processor
7
+- This is done because users don't have access to DVFS (dynamic voltage and frequency scaling)
8
+- Distributes work across CPU and GPU in such a way as to optimise a chosen power metric
9
+
10
+# Methodology
11
+
12
+- Automatically characterising behaviour is non-trivial
13
+- Measure package power = CPU cores, GPU cores, ring interconnect and LLC
14
+- Probed processor power use under various workloads to model power consumption
15
+- Used microbenchmarks to get power characterisations
16
+- The application is also profiled to determine whether it is memory bound or compute bound
17
+- Use a mathematical algorithm to produce the GPU offload ratio that minimises a given metric
18
+- To implement, they use the Concord framework (heterogeneous C++), but the algorithm is not tied to it
19
+- Work stealing from CPU
20
+- Ran a number of different benchmarks to evaluate performance
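
The offload-ratio step above could look something like the sketch below. It is not the paper's algorithm: the linear time and energy model, the brute-force search, and every name (`best_offload_ratio`, `cpu_rate`, `base_power`, and so on) are assumptions chosen to show the idea of picking the GPU share that minimises a caller-supplied metric.

```python
def best_offload_ratio(n_items, cpu_rate, gpu_rate,
                       cpu_power, gpu_power, base_power,
                       metric, steps=100):
    """Return the GPU share in [0, 1] that minimises metric(energy, time).

    Assumed model: both devices run concurrently at fixed rates (items/s),
    each draws a fixed power (W) while busy, and base_power is drawn for
    the whole run.
    """
    best_score, best_alpha = None, 0.0
    for i in range(steps + 1):
        alpha = i / steps                        # fraction of items sent to the GPU
        t_gpu = alpha * n_items / gpu_rate
        t_cpu = (1 - alpha) * n_items / cpu_rate
        total_time = max(t_gpu, t_cpu)           # devices finish independently
        energy = (cpu_power * t_cpu + gpu_power * t_gpu
                  + base_power * total_time)
        score = metric(energy, total_time)
        if best_score is None or score < best_score:
            best_score, best_alpha = score, alpha
    return best_alpha


def time_only(energy, time):
    return time


def energy_only(energy, time):
    return energy


def energy_delay(energy, time):
    return energy * time
```

With a GPU three times faster than the CPU, `best_offload_ratio(1200, 1, 3, 10, 15, 0, time_only)` balances the two devices at a ratio of 0.75, whereas `energy_only` with the same numbers sends everything to the GPU because it is cheaper per item in this model.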
21
+
22
+# Critical Assessment
23
+
24
+- Does well by using black-box approach
25
+- Other solutions relied on static analysis or existing knowledge of processor
26
+- Their solution is run at user level and doesn't require controlling of DVFS
27
+- Limited to applications that benefit from offloading work to GPU

+ 19 - 0
funcmemo.md

@@ -0,0 +1,19 @@
1
+# Summarise Contribution & Motivation
2
+
3
+- Build on existing work of function memoization
4
+- Extend the scope by doing it at compile time, which allows user-defined functions to be memoized instead of only dynamically linked ones
5
+- Memoization is just lookup in table of function results
6
+- Software solution, plus a hardware solution for increased gains
7
+
8
+# Methodology
9
+
10
+- Identify functions which can be memoized
11
+- Functions that can be memoized are replaced with a memoization wrapper, which looks up the result in a table or falls back to the original function and stores the result in the table
12
+- Global variables are considered as extra arguments
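
A minimal sketch of the wrapper idea above, written in Python rather than as the compile-time transformation the paper performs. The names `memoize_with_globals`, `CONFIG` and `scaled_square` are hypothetical; the point is only that the table key includes both the arguments and the global state the function reads.

```python
import functools


def memoize_with_globals(read_globals):
    """Wrap a function with a result table keyed on (args, globals read).

    `read_globals` is a zero-argument callable returning a tuple of the
    current values of the globals the function depends on, so they act as
    extra arguments to the lookup.
    """
    def decorate(fn):
        table = {}

        @functools.wraps(fn)
        def wrapper(*args):
            key = (args, read_globals())
            if key in table:              # hit: skip the computation
                return table[key]
            result = fn(*args)            # miss: fall back to the original
            table[key] = result           # and store the result for next time
            return result

        wrapper.table = table             # exposed for inspection only
        return wrapper
    return decorate


CONFIG = {"scale": 2}   # stands in for a global variable
CALLS = {"n": 0}        # counts real executions of the function body


@memoize_with_globals(lambda: (CONFIG["scale"],))
def scaled_square(x):
    CALLS["n"] += 1
    return CONFIG["scale"] * x * x
```

Calling `scaled_square(3)` twice runs the body once; changing `CONFIG["scale"]` produces a different key, so the next call recomputes instead of returning a stale result.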
13
+
14
+# Critical Assessment
15
+
16
+- Achieves better speedup than load-time optimisation in previous work
17
+- Causes memory overhead
18
+- Inlining can increase code size
19
+- Hardware solution is heavyweight and even with powergating has area implications

+ 27 - 0
manmem.md

@@ -0,0 +1,27 @@
1
+# Summarise Contribution & Motivation
2
+
3
+- There are lots of safe programming languages
4
+- People don't use them
5
+- Garbage collection is a major source of inefficiency in safe languages
6
+- Safe manual memory management is a potential solution to this.
7
+- Simply add a delete operator to free memory, and an exception if that memory is then dereferenced
8
+
9
+# Methodology
10
+
11
+- Programming model changes
12
+- Replace GC heap with manually managed heap allocated from new keyword
13
+- New delete operator
14
+- Guarantee memory safety with new exception
15
+- Does not impact on the compiler or programmer too much, no restriction on aliasing
16
+- Delete semantics are intentionally weak for performance reasons, but maintain safety from use-after-free bugs
17
+- Uses 64 bit hardware to assign each object new virtual addresses _without_ reusing one until safe to do so
18
+- The processor's MMU will then detect violations as objects are unmapped from the application's address space
19
+- Objects are allocated on new virtual pages, as virtual memory operations are only allowed at page granularity
20
+- Included an allocator in .NET toolchain
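
The programming model above can be illustrated with a toy simulation. This is not the .NET allocator from the paper: `ToyHeap`, `DanglingPointerError` and the dictionary standing in for the page table are assumptions, and raising on a double delete is a choice made for this sketch only. Every allocation gets a fresh virtual address that is never reused, `delete` unmaps it, and any later access through an alias raises an exception.

```python
class DanglingPointerError(Exception):
    """Raised when a deleted (unmapped) address is accessed."""


class ToyHeap:
    PAGE = 4096  # one object per virtual page in this toy model

    def __init__(self):
        self._next_addr = self.PAGE
        self._mapped = {}  # stands in for the MMU's page table

    def new(self, value):
        addr = self._next_addr
        self._next_addr += self.PAGE  # addresses are never handed out twice
        self._mapped[addr] = value
        return addr

    def delete(self, addr):
        if addr not in self._mapped:
            # Assumption of this sketch: double delete is also an error.
            raise DanglingPointerError(hex(addr))
        del self._mapped[addr]  # "unmap" the page

    def load(self, addr):
        try:
            return self._mapped[addr]
        except KeyError:
            raise DanglingPointerError(hex(addr)) from None
```

Aliasing is unrestricted: any number of variables may hold the same address, and all of them fault after the object is deleted.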
21
+
22
+# Critical Assessment
23
+
24
+- Good solution that builds on other works' shortcomings
25
+- Still places a burden on programmers, who are unlikely to make changes
26
+- Using hardware to detect violations is good idea to keep overhead low
27
+- Somewhat non-deterministic, but this doesn't matter as the original program was too, and they address this (extensive testing + debug option)

+ 29 - 0
webasm.md

@@ -0,0 +1,29 @@
1
+# Summarise Contribution & Motivation
2
+
3
+- Web is becoming mature and now requires more complex applications to run on it
4
+- JavaScript is currently the only built-in language
5
+- JavaScript is not well equipped to deal with these applications or to be a compile target
6
+- WebAssembly is a low level bytecode for the web
7
+- Aims to provide safe, low overhead execution
8
+- Better solution than plugins for safety
9
+- Better than asm.js for consistent performance
10
+
11
+# Methodology
12
+
13
+- Defines modules for each binary and therefore allows imports
14
+- Defines functions, not first class and not nested, call stack not exposed
15
+- Instructions based on a stack machine, for compactness
16
+- Only defines 4 value types: 32-bit and 64-bit integers and 32-bit and 64-bit IEEE floating point
17
+- Has global and local variables
18
+- Memories are defined by modules and are little-endian, disjoint from code space and stack so programs can only mess up their own environment
19
+- Does not offer simple jumps, has structured control flow, gives single pass validation/compilation/SSA
20
+- Can do JIT and validation in single pass
21
+- Transmitted over wire in binary form, streaming compilation possible due to layout
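
To make the stack-machine and single-pass validation points concrete, here is a toy validator for a tiny straight-line subset. It is a sketch, not the validation algorithm from the WebAssembly specification: the tuple encoding of instructions and the `validate` function are assumptions, and structured control flow is left out entirely. The instruction names themselves (`i32.const`, `local.get`, `i32.add`, `drop`, and so on) are real WebAssembly mnemonics.

```python
CONSTS = {"i32.const": "i32", "i64.const": "i64",
          "f32.const": "f32", "f64.const": "f64"}
BINOPS = {"i32.add": "i32", "i32.mul": "i32",
          "i64.add": "i64", "f32.add": "f32", "f64.mul": "f64"}


def validate(body, local_types, result_types):
    """Type-check a straight-line body in one left-to-right pass.

    Only the types on the operand stack are tracked, never the values.
    """
    stack = []
    for instr in body:
        op = instr[0]
        if op in CONSTS:
            stack.append(CONSTS[op])
        elif op == "local.get":
            stack.append(local_types[instr[1]])
        elif op in BINOPS:
            ty = BINOPS[op]
            if stack[-2:] != [ty, ty]:
                raise TypeError(f"{op} expects two {ty} operands, found {stack}")
            stack.pop()
            stack.pop()
            stack.append(ty)          # two operands consumed, one result pushed
        elif op == "drop":
            if not stack:
                raise TypeError("drop on an empty stack")
            stack.pop()
        else:
            raise ValueError(f"instruction not in this toy subset: {op}")
    if stack != list(result_types):
        raise TypeError(f"body leaves {stack}, expected {list(result_types)}")
    return True


# Body of a function (param i32) (result i32) that returns its argument plus one.
increment = [("local.get", 0), ("i32.const", 1), ("i32.add",)]
```

`validate(increment, ["i32"], ["i32"])` succeeds, while swapping the constant for `("f32.const", 1.0)` is rejected because `i32.add` would see mismatched operand types.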
22
+
23
+# Critical Assessment
24
+
25
+- Improves on Native Client by being available in all browsers; Native Client is not compact and still requires knowledge of the underlying system
26
+- Traps cannot be handled within WebAssembly itself and need JS intervention
27
+- Still missing features for higher level languages
28
+- A compile target for the web with safety by design should prevent many exploits
29
+-