{"id":36780,"date":"2020-07-07T15:00:41","date_gmt":"2020-07-07T14:00:41","guid":{"rendered":""},"modified":"2020-07-02T20:43:01","modified_gmt":"2020-07-02T19:43:01","slug":"how-to-operationalise-your-data-lake","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-gb\/industry\/blog\/technetuk\/2020\/07\/07\/how-to-operationalise-your-data-lake\/","title":{"rendered":"How to Operationalise your Data Lake"},"content":{"rendered":"

\"The<\/p>\n

Data lake operationalisation is a colossal topic, with much deliberation over either building the right data lake or defining the right strategy. The five important points that everyone stresses before starting to build a data lake are:<\/p>\n

\"The<\/p>\n

This blog provides six mantras for organisations to consider in order to successfully tame the \u201cOperationalising\u201d of a data lake after its production release.<\/p>\n


1. ALWAYS have a North Star Architecture<\/h2>\n

Data lakes are not only about pooling data, but also about how that data is consumed. The choice of data lake pattern depends on the masterpiece one wants to paint.<\/p>\n

Central vs Federated vs Hybrid<\/h3>\n

Depending on the organisation\u2019s needs, you can store all the enterprise data in one location (Central), closest to the organisation\u2019s headquarters, or, where sovereignty requirements apply, keep the data stored in the respective subsidiaries (Federated).<\/p>\n

If an enterprise has a global footprint, adopting a hub-and-spoke model (Hybrid), with satellites of local data closer to the reporting countries, would do the trick. Although this model brings alignment issues (data replication, etc.), it aids performance, regional governance and development (Fig 1).<\/p>\n

Figure 1 \u2013 Hybrid Architecture<\/p>\n

Streamed vs. Batch vs. Near Real Time<\/h3>\n