Creating Data Repositories..
Sanjay Rao
ECE Dept, Purdue University

Group Members
  Dave Maltz
  Rebecca Isaacs
  Ratul Mahajan
  Yin Zhang
  Aditya Akella
  David Kotz
  Charles DiFatta
  ..

Motivation
  Network Management Research:
    Barrier to entry is high
    Data/insights from operators/industry are critical
    Examples:
      Failure characterization of enterprise networks
      VLAN characterization and use
      Configuration management

What happens today..?
  End-user centric measurement studies
    Network as "black-box": no operator involvement
    Real need: "white-box"
  Campus networks
    Difficulties in bootstrapping relationships with operators
  Enterprise/operator networks
    Sprint or AT&T (Microsoft with end-user)
    Limited pool of researchers
  Data across multiple enterprises??
  Trends over many years??

Bottom line
  Need a data repository
    Contributors from operators, researchers, industry
    Accessible to all researchers
    Facilitate research much like PlanetLab
  Vital to have a "critical mass" of researchers on network management
  Research along high-impact real problems

Data Sharing: what inhibits it?
  Sensitivity of data
    Security issues (firewall policies, network structure)
    Privacy issues (records of individual activity)
  Proprietary nature of data
    E.g., call volumes, mobility models
  Possible to have others use it?
    "Secret weapon" for research
    Competition vs. collaboration
  Inertia / too much effort

Solutions
  Carrots/sticks to promote data sharing
    Must release data to publish
    IMC: best paper award only to work releasing data
  Technical ways to address concerns with sharing

Positive Example

Research: Anonymization
  Hiding provider, hiding individual information
  Need a framework to reason about it
    What trade-offs do you make?
    What risks are posed?
    How to expose trade-offs in a way we can appreciate?
  Anonymization is very domain specific, e.g.
    configuration file vs. packet trace
    Are there common themes?
  Other models:
    NDA-based
    "Give me a question" -> "return answer"
    Exploratory nature of research

Community effort: Cooperate on IRB
  Social sciences: lots of experience with IRB
  Networking: lack of clear guidelines on IRB process
  Admins feel happier if IRB can "sanction" things
  As a community:
    Must appreciate need/process for IRB
    Develop guidelines for IRB process
    Share IRB documents

Creating shareable data
  75% of time spent figuring out how to use data
  Researcher needs vary
  Different forms of data
    Historical vs. streaming
    Dated? Trending?
  Assumptions made / gaps in data
    "Timing info crucial at sub-RTT level"?
  Sharing is hard, many idiosyncrasies
  Data collection infrastructure, annotate

User Diagnostics
  One-on-one: exact data provided
  Create shared repository(ies)
    What data do most users want?
    Is that 20% of stuff most critical to provide?
  Data collection tools
  Meta-data part of problem
    Create data in standard formats
    "Observatory": how to discover, describe, explain data
    Access policy, use policy

Other
  Streaming data: online vs. offline
  Scalable collection:
    What to collect? Over how long?
    Compression techniques
    Fine-grained: overhead; coarse-grained: information loss
  What does it take to build this infrastructure?
    Get all types of data as painlessly as possible
    Massage, orchestrate data to fit researcher needs
    Simple APIs to get data out, fast analysis tools
  Federated access
  Data management: lifecycle of data

Action Items
  Community-wide efforts:
    Initiate efforts to create data repository
      How to manage? Who contributes? Who arbitrates?
      How much storage?
      Lifecycle: how long to store data?
    Create IRB guidelines for networking data
  Research:
    Anonymization
    Usage diagnostics -> what to collect, release: widely applicable
    Data collection tools, metadata information
  Industry, operators must be as actively involved as possible

Positive Example (slide text)
  Example: HSARPA "PREDICT": make research on network security possible
  Firewalls and IDS network security data
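
Illustration for "Research: Anonymization" (not from the original slides)
The trade-off question on that slide can be made concrete with a small sketch. It assumes flow-style records held as Python dicts with hypothetical "src", "dst" and "bytes" fields, and uses keyed hashing (HMAC-SHA256), which is one possible technique and not a method the slides propose. The names pseudonymize_ip and anonymize_records are made up for this example. What is kept: the same host always maps to the same pseudonym, so per-host analysis still works. What is lost: subnet/prefix structure, because addresses in the same prefix map to unrelated pseudonyms.

```python
import hashlib
import hmac
import ipaddress


def pseudonymize_ip(ip: str, key: bytes) -> str:
    """Map an IPv4 address to a pseudonymous IPv4 address with a keyed hash.

    Hypothetical helper: the same (ip, key) pair always yields the same
    pseudonym, but prefix relationships between addresses are not preserved.
    """
    packed = ipaddress.IPv4Address(ip).packed            # 4 raw bytes
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))        # first 4 bytes as IPv4


def anonymize_records(records, key: bytes):
    """Return copies of the records with 'src' and 'dst' pseudonymized.

    The record layout (src, dst, bytes) is an assumption for illustration.
    """
    return [
        {
            **rec,
            "src": pseudonymize_ip(rec["src"], key),
            "dst": pseudonymize_ip(rec["dst"], key),
        }
        for rec in records
    ]


if __name__ == "__main__":
    key = b"example-secret-key"  # in practice, kept private by the data owner
    flows = [
        {"src": "10.0.0.1", "dst": "10.0.0.2", "bytes": 1200},
        {"src": "10.0.0.1", "dst": "192.168.1.9", "bytes": 300},
    ]
    for rec in anonymize_records(flows, key):
        print(rec)
```

A prefix-preserving scheme would keep subnet structure at the cost of leaking more about network layout, which is the kind of trade-off the slide asks a framework to expose.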