CHANGES.log
GOALS
- More Dev Friendly, especially external devs
- - jars downloaded via maven/gradle
- - - Trim down final H2O.jar (8M down from 50M)
- - - One Jar but no unzip one-jar into /tmp
- - CRAN-acceptable jar size (under 5meg?)
- - no class loader
- - run JUnits direct from IDE
- Slim & Trim Java API
- - drop all directories; from scratch clean-slate
- - drop all VA things
- - add code back in AFTER scrub
- - - Only what's needed; and made private/protected
- - Restructure worst offenders (Log, H2O)
- Existing tests should pass (minus VA)
- Same-As performance & memory consumption
- Replace
- - API, Model, Job, Auto-Gen: Documentation, R-bindings
- NOT: Multi-tenancy, Resource-Control
- Resource management memory: ref-cnt
- Resource management CPU: CPS-style
Scala M/R functions???
-----------------------------
Major style & code changes from "h2o":
- Drop the "java" directory layer.
src/main/jsr166y - D.L. F/J. Appears in 1.7?
main/water - Core K/V & F/J & M/R code
water/init - Cluster & network init; weaver
water/nbhm - Nonblocking HashMap; Unsafe util
water/persist - Persist layer; NFS, HDFS, S3 access
water/util - Utilities
test/water - Mirror directory to main with tests
- Use Maven to fetch external jar files.
- "One Jar" is done by unpacking/repacking all jars, making a single large jar
that does not need any unpack step. Downside: class-file collisions are
silently ignored (Jar Hell), with the last to unpack winning. H2O classes
always win, but e.g. conflicting embedded versions of log4j will end with
the last log4j class unpacked.
- Make "mostly" works; it sometimes over-compiles when there are errors.
- Move much init code into "water/init", including all network init from H2O.java
- Trim the API. Make All Things Private by default. Aggressive IntelliJ IDEA warning removal.
- Switch to 1.7 javac by default
- Do not hook stdout & stderr
- Complete Log class overhaul
- Complete Excel file parsing overhaul; dropped poi.jar
- Remove all the Embedded support; embeddable-by-default now
- Weaver uses Delegation instead of a Class Loader. Serialization bases need
boilerplate, and there are several more v-calls at the start of any
serialization. Theory says: only static calls to recursively serialize any
object; should be fast enough.
- Overhaul Vec rollups - try to compute them only from the home node so we do
not get racing Vec rollups for the same Vec from many nodes.
- DKV key prefetch instead of F/J background task gets
- TaskPutKey requires a Futures - no fire-and-forget remote puts - you gotta "remember" them later
- Removable into Lockable from UKV
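The "no fire-and-forget remote puts" note above can be illustrated with a minimal sketch. It is not the H2O code: the class PendingPutsSketch, the Pending collector, its add/blockForPending methods, and the in-memory STORE are all hypothetical names built on java.util.concurrent, assumed here only to show the shape of "every put is remembered, then waited on later".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a remote K/V store; not the H2O API.
class PendingPutsSketch {
    static final Map<String, String> STORE = new ConcurrentHashMap<>();

    // Collects in-flight operations so the caller can wait on them later.
    static final class Pending {
        private final List<CompletableFuture<?>> futures = new ArrayList<>();

        void add(CompletableFuture<?> f) { futures.add(f); }

        // Block until every remembered operation has completed.
        void blockForPending() {
            CompletableFuture.allOf(futures.toArray(new CompletableFuture<?>[0])).join();
            futures.clear();
        }
    }

    // An asynchronous put that cannot be fire-and-forget:
    // the caller must supply a Pending collector.
    static void put(String key, String value, Pending pending) {
        if (pending == null) throw new IllegalArgumentException("a Pending collector is required");
        pending.add(CompletableFuture.runAsync(() -> STORE.put(key, value)));
    }

    // Issues three asynchronous puts, waits for all of them, returns the store size.
    static int runDemo() {
        STORE.clear();
        Pending pending = new Pending();
        for (int i = 0; i < 3; i++) put("key" + i, "value" + i, pending);
        pending.blockForPending(); // all puts are visible once this returns
        return STORE.size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints 3
    }
}
```

Making the collector a required argument moves the "did anyone wait for this?" question to the call site, which is the point of the change note.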
DETAIL CHANGES
- public ==> private or protected (lots of these!)
- throw Log.err("msg",ex) ==> Log.throwErr(ex)
- MRTask2 ==> MRTask
- H2O.D() moved to Key.D()
- H2O.Cleaner moved to Cleaner.java
- Array utils from util.Utils to util.Arrays
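As a rough illustration of the Log.err ==> Log.throwErr item above, here is a sketch of one way such a helper could be shaped. It is an assumption, not the actual Log class: LogSketch, its LINES buffer, and the choice to declare a RuntimeException return type (so call sites can still write "throw ..." and satisfy javac's flow analysis) are all hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical logger sketch; the real Log class may differ.
class LogSketch {
    // In-memory log buffer, used here instead of real log output.
    static final List<String> LINES = new ArrayList<>();

    // Logs the error, then rethrows it (wrapped if it is a checked exception).
    // The declared return type lets callers write "throw throwErr(ex);".
    static RuntimeException throwErr(Throwable e) {
        LINES.add("ERROR: " + e);
        throw (e instanceof RuntimeException) ? (RuntimeException) e : new RuntimeException(e);
    }

    // Example call site: a checked exception is logged and surfaces unchecked.
    static String demo() {
        try {
            throw throwErr(new IOException("disk full"));
        } catch (RuntimeException re) {
            return re.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints disk full
    }
}
```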