Providing modern data transfer and storage service at Microsoft with Microsoft Azure

|

Two people work at their PCs in an office.
Deploying a Modern Data Transfer Service internally at Microsoft is providing the company with an enterprise-scale system for transferring large files in and out of its firewall.

Microsoft Digital technical storiesCompanies all over the world have launched their cloud adoption journey. While some are just starting, others are further along the path and are now researching the best options for moving their largest, most complex workflows to the cloud. It can take time for companies to address legacy tools and systems that have on-premises infrastructure dependencies.

Our Microsoft Digital Employee Experience (MDEE) team has been running our company as mostly cloud-only since 2018, and continues to design cloud-only solutions to help fulfill our Internet First and Microsoft Zero Trust goals.

In MDEE, we designed a Modern Data Transfer Service (MDTS), an enterprise-scale solution that allows the transfer of large files to and from partners outside the firewall and removes the need for an extranet.

MDTS makes cloud adoption easier for teams inside Microsoft and encourages the use of Microsoft Azure for all of their data transfer and storage scenarios. As a result, engineering teams can focus on building software and shipping products instead of dealing with the management overhead of Microsoft Azure subscriptions and becoming subject matter experts on infrastructure.

[Unpack simplifying Microsoft’s royalty ecosystem with connected data service. | Check out how Microsoft employees are leveraging the cloud for file storage with OneDrive Folder Backup. | Read more on simplifying compliance evidence management with Microsoft Azure confidential ledger.]

Leveraging our knowledge and experience

As part of Microsoft’s cloud adoption journey, we have been continuously looking for opportunities to help other organizations move data and remaining legacy workflows to the cloud. With more than 220,000 employees and over 150 partners that data is shared with, not every team had a clear path for converting their transfer and storage patterns into successful cloud scenarios.

We have a high level of Microsoft Azure service knowledge and expertise when it comes to storage and data transfer. We also have a long history with legacy on-premises storage designs and hybrid third-party cloud designs. Over the past decade, we engineered several data transfer and storage services to facilitate the needs of Microsoft engineering teams. Those services traditionally leveraged either on-premises designs or hybrid designs with some cloud storage. In 2019, we began to seriously look at replacing our hybrid model, which included a mix of on-premises resources, third party software, and Microsoft Azure services, with one modern service that would completely satisfy our customer scenarios using only Azure—thanks to new capabilities in Azure making it possible and it being the right time.

MDTS uses out of the box Microsoft Azure storage configurations and capabilities to help us address legacy on-premises storage patterns and support Microsoft core commitments to fully adopt Azure in a way that satisfies security requirements. Managed by a dedicated team of service engineers, program managers, and software developers, MDTS offers performance, security, and is available to any engineering team at Microsoft that needs to move their data storage and transfer to the cloud.

Designing a Modern Data Transfer and Storage Service

The design goal for MDTS was to create a single storage service offering entirely in Microsoft Azure, that would be flexible enough to meet the needs of most engineering teams at Microsoft. The service needed to be sustainable as a long-term solution, continue to support ongoing Internet First and Zero Trust Network security designs, and have the capability to adapt to evolving technology and security requirements.

Identifying use cases

First, we needed to identify the top use cases we wanted to solve and evaluate which combination of Microsoft Azure services would help us meet our requirements. The primary use cases we identified for our design included:

  • Sharing and/or distribution of complex payloads: We not only had to provide storage for corporate sharing needs, but also share those same materials externally. The variety of file sizes and different payload characteristics can be challenging because they don’t always fit a standard profile for files (e.g., Office docs, etc.).
  • Cloud storage adoption (shifting from on-premises to cloud): We wanted to ensure that engineering teams across Microsoft that needed a path to the cloud would have a roadmap. This need could arise because of expiring on-premises infrastructure, corporate direction, or other modernization initiatives like ours.
  • Consolidation of multiple storage solutions into one service, to reduce security risks and administrative overhead: Having to place data and content in multiple storage datastores for the purposes of specific sharing or performance needs is both cumbersome and can introduce additional risk. Because there wasn’t yet a single service that could meet all their sharing needs and performance requirements, employees and teams at Microsoft were using a variety of locations and services to store and share data.

Security, performance, and user experience design requirements

After identifying the use cases for MDTS, we focused on our primary design requirements. They fell into three high-level categories: security, performance, and user experience.

Security

The data transfer and storage design needed to follow our Internet First and Zero Trust network design principles. Accomplishing parity with Zero Trust meant leveraging best practices for encryption, standard ports, and authentication. At Microsoft, we already have standard design patterns that define how these pieces should be delivered.

  • Encryption: Data is encrypted both in transit and at rest.
  • Authentication: Microsoft Azure Active Directory supports both corporate synced domain accounts, external business-to-business accounts, as well as corporate and external security groups. Leveraging Azure Active Directory allows teams to remove dependencies on corporate domain controllers for authentication.
  • Authorization: Microsoft Azure Data Lake Gen2 storage provides fine grained access to containers and subfolders. This is possible because of many new capabilities, most notably the support for OAuth, hierarchical name space, and POSIX permissions. These capabilities are necessities of a Zero Trust network security design.
  • No non-standard ports: Opening non-standard ports can present a security risk. Using only HTTPS and TCP 443 as the mechanisms for transport and communication prevents opening non-standard ports. This includes having software capable of transport that maximizes the ingress/egress capabilities of the storage platform. Microsoft Azure Storage Explorer, AzCopy, and Microsoft Azure Data Factory meet the no non-standard ports requirement.

Performance

Payloads can range from being comprised of one very large file, millions of small files, and every combination in between. Scenarios across the payload spectrum have their own computing and storage performance considerations and challenges. Microsoft Azure has optimized software solutions for achieving the best possible storage ingress and egress. MDTS helps ensure that customers know what optimized solutions are available to them, provides configuration best practices, and shares the learnings with Azure Engineering to enable robust enterprise scale scenarios.

  • Data transfer speeds: Having software capable of maximizing the ingress/egress capabilities of the storage platform is preferable for engineering-type workloads. It’s common for these workloads to have complex payloads, payloads with several large files (10-500 GB) or millions of small files.
  • Ingress and egress: Support for ingress upwards of 10 Gbps and egress of 50 Gbps. Furthermore, client and server software that can consume the maximum amount of bandwidth possible up to the maximum amount in ingress/egress possible on client and storage.

 

Data size/ bandwidth 50 Mbps 100 Mbps 500 Mbps 1 Gbps 5 Gbps 10 Gbps
1 GB 2.7 minutes 1.4 minutes 0.3 minutes 0.1 minutes 0.03 minutes 0.010 minutes
10 GB 27.3 minutes 13.7 minutes 2.7 minutes 1.3 minutes 0.3 minutes >0.1 minutes
100 GB 4.6 hours 2.3 hours 0.5 hours 0.2 hours 0.05 hours 0.02 hours
1 TB 46.6 hours 23.3 hours 4.7 hours 2.3 hours 0.5 hours 0.2 hours
10 TB 19.4 days 9.7 days 1.9 days 0.9 days 0.2 days 0.1 days

Copy duration calculations based on data size and the bandwidth limit for the environment.

User experience

Users and systems need a way to perform manual and automated storage actions with graphical, command line, or API-initiated experiences.

  • Graphical user experience: Microsoft Azure Storage Explorer provides Storage Admins the ability to graphically manage storage. It also has storage consumer features for those who don’t have permissions for Administrative actions, and simply need to perform common storage actions like uploading, downloading, etc.
  • Command line experience: AzCopy provides developers with an easy way to automate common storage actions through CLI or scheduled tasks.
  • Automated experiences: Both Microsoft Azure Data Factory and AzCopy provide the ability for applications to use Azure Data Lake Gen2 storage as its primary storage source and destination.

Identifying personas

Because a diverse set of personas utilize storage for different purposes, we need to design storage experiences that satisfy the range of business needs. Through the process of development, we identified these custom persona experiences relevant to both storage and data transfer:

  • Storage Admins: The Storage Admins are Microsoft Azure subscription owners. Within the Azure subscription they create, manage, and maintain all aspects of MDTS: Storage Accounts, Data Factories, Storage Actions Service, and Self-Service Portal. Storage Admins also resolve requests and incidents that are not handled via Self-Service.
  • Data Owners: The Data Owner personas are those requesting storage who have the authority to create shares and authorize storage. Data Owners also perform the initial steps of creating automated distributions of data to and from private sites. Data Owners are essentially the decision makers of the storage following handoff of a storage account from Storage Admins.
  • Storage Consumers: At Microsoft, storage consumers represent a broad set of disciplines, from engineers and developers to project managers and marketing professionals. Storage Consumers can use Microsoft Azure Storage Explorer to perform storage actions to and from authorized storage paths (aka Shares). Within the MDTS Self Service Portal, a storage consumer can be given authorization to create distributions. A distribution can automate the transfer of data from a source to one or multiple destinations.

Implementing and enhancing the solution architecture

After considering multiple Microsoft Azure storage types and complimentary Azure Services, the MDTS team chose the following Microsoft Azure services and software as the foundation for offering a storage and data transfer service to Microsoft Engineering Groups.

  • Microsoft Azure Active Directory: Meets the requirements for authentication and access.
  • Microsoft Azure Data Lake Gen2: Meets security and performance requirements by providing encryption, OAuth, Hierarical namespace, fine grained authorization to Azure Active Directory entities, and 10+ GB per sec ingress and egress.
  • Microsoft Azure Storage Explorer: Meets security, performance, and user experience requirements by providing a graphical experience to perform storage administrative tasks and storage consumer tasks without needing a storage account key or role based access (RBAC) on an Azure resource. Azure Storage Explorer also has AzCopy embedded to satisfy performance for complex payloads.
  • AzCopy: Provides a robust and highly performant command line interface.
  • Microsoft Azure Data Factory: Meets the requirements for orchestrating and automating data copies between private networks and Azure Data Lake Gen2 storage paths. Azure Data Factory copy activities are equally as performant as AzCopy and satisfy security requriements.

Enabling Storage and Orchestration

As illustrated below, the first MDTS design was comprised entirely of Microsoft Azure Services with no additional investment from us other than people to manage the Microsoft Azure subscription and perform routine requests. MDTS was offered as a commodity service to engineering teams at Microsoft in January 2020. Within a few months we saw a reduction of third-party software and on-premises file server storage, which provided significant savings. This migration also contributed progress towards the company-wide objectives of Internet First and Zero Trust design patterns.

The first design of MDTS provides storage and orchestration using out of the box Microsoft Azure services.
The first design of MDTS provides storage and orchestration using out of the box Microsoft Azure services.

We initially onboarded 35 engineering teams which included 10,000 Microsoft Azure Storage Explorer users (internal and external accounts), and 600 TB per month of Microsoft Azure storage uploads and downloads. By offering the MDTS service, we saved engineering teams from having to run Azure subscriptions themselves and needing to learn the technical details of implementing a modern cloud storage solution.

Creating access control models

As a team, we quickly discovered that having specific repeatable implementation strategies was essential when configuring public facing Microsoft Azure storage. Our initial time investment was in standardizing an access control process which would simplify complexity and ensure a correct security posture before handing off storage to customers. To do this, we constructed onboarding processes for identifying the type of share for which we standardized the implementation steps.

We implemented standard access control models for two types of shares: container shares and sub-shares.

Container share access control model

The container share access control model is used for scenarios where the data owner prefers users to have access to a broad set of data. As illustrated in the graphic below, container shares supply access to the root, or parent, of a folder hierarchy. The container is the parent. Any member of the security group will gain access to the top level. When creating a container share, we also make it possible to convert to a sub-share access control model if desired.

 

Microsoft Azure Storage Explorer grants access to the root of a folder hierarchy using the container share access control model.
Microsoft Azure Storage Explorer grants access to the root, or parent, of a folder hierarchy using the container share access control model. Both engineering and marketing are containers. Each has a specific Microsoft Azure Active Directory Security group. A top-level Microsoft Azure AD Security group is also added to minimize effort for users who should get access to all containers added to the storage account.

This model fits scenarios where group members get Read, Write, and Execute permissions to an entire container. The authorization allows users to upload, download, create, and/or delete folders and files. Making changes to the Access Control restricts access. For example, to create access permissions for download only, select Read and Execute.

Sub-share access control model

The sub-share access control model is used for scenarios where the data owner prefers users have explicit access to folders only. As illustrated in the graphic below, folders are hierarchically created under the container. In cases where several folders exist, a security group access control can be implemented on a specific folder. Access is granted to the folder where the access control is applied. This prevents users from seeing or navigating folders under the container other than the folders where an explicit access control is applied. When users attempt to browse the container, authorization will fail.

 

Microsoft Azure Storage Explorer grants access to sub-folder only using the sub-share access control model.
Microsoft Azure Storage Explorer grants access to sub-folder only using the sub-share access control model. Members are added to the sub-share group, not the container group. The sub-share group is nested in the container group with execute permissions to allow for Read and Write on the sub-share.

This model fits for scenarios where group members get Read, Write, and Execute permissions to a sub-folder only. The authorization allows users to upload, download, create folders/files, and delete folders/files. The access control is specific to the folder “project1.” In this model you can have multiple folders under the container, but only provide authorization to a specific folder.

The sub-share process is only applied if a sub-share is needed.

  • Any folder needing explicit authorization is considered a sub-share.
  • We apply a sub-share security group access control with Read, Write, and Execute on the folder.
  • We nest the sub-share security group in the parent share security group used for Execute only. This allows members who do not have access to the container enough authorization to Read, Write, and Execute the specific sub-share folder without having Read or Write permissions to any other folders in the container.

Applying access controls for each type of share (container and or sub-share)

The parent share process is standard for each storage account.

  • Each storge account has a unique security group. This security group will have access control applied for any containers. This allows data owners to add members and effectively give access to all containers (current and future) by simply changing the membership of one group.
  • Each container will have a unique security group for Read, Write, and Execute. This security group is used to isolate authorization to a single container.
  • Each container will have a unique group for execute. This security group is needed in the event sub-shares are created. Sub-shares are folder-specific shares in the hierarchical namespace.
  • We always use the default access control option. This is a feature that automatically applies the parent permissions to all new child folders (sub-folders).

The first design enabled us to offer MDTS while our engineers defined, designed, and developed an improved experience for all the personas. It quickly became evident that Storage Admins needed the ability to see an aggregate view of all storage actions in near real-time to successfully operate the service. It was important for our administrators to easily discover the most active accounts and which user, service principle, or managed service identity was making storage requests or performing storage actions. In July 2020, we added the Aggregate Storage Actions service.

Adding aggregate storage actions

For our second MDTS design, we augmented the out of the box Microsoft Azure Storage capabilities used in our first design with the capabilities of Microsoft Azure Monitor, Event Hubs, Stream Analytics, Function Apps, and Microsoft Azure Data Explorer to provide aggregate storage actions. Once the Aggregate Storage Actions capability was deployed and configured within MDTS, storage admins were able to aggregate the storage actions of all their storage accounts and see them in a single pane view.

 

The second design of MDTS introduces aggregate storage actions.
The second design of MDTS introduces aggregate storage actions.

The Microsoft Azure Storage Diagnostic settings in Microsoft Azure Portal makes it possible for us to configure specific settings for blob actions. Combining this feature with other Azure Services and some custom data manipulation gives MDTS the ability to see which users are performing storage actions, what those storage actions are, and when those actions were performed. The data visualizations are near real-time and aggregated across all the storage accounts.

Storage accounts are configured to route logs from Microsoft Azure Monitor to Event Hub. We currently have 45+ storage accounts that generate around five million logs each day. Data filtering, manipulation, and grouping is performed by Stream Analytics. Function Apps are responsible for fetching UPNs using Graph API, then pushing logs to Microsoft Azure Data Explorer. Microsoft Power BI and our modern self-service portal query Microsoft Azure Data Explorer and provide the visualizations, including dashboards with drill down functionality. The data available in our dashboard includes the following information aggregated across all customers (currently 35 storage accounts).

  • Aggregate view of most active accounts based on log activity.
  • Aggregate total of GB uploaded and download per storage account.
  • Top users who uploaded showing the user principal name (both external and internal).
  • Top users who downloaded showing the user principal name (both external and internal).
  • Top Accounts uploaded data.
  • Top Accounts downloaded data.

The only setting required to onboard new storage accounts is to configure them to route logs to the Event Hub. Because we can have an aggregate store of all the storage account activities, we are able to offer MDTS customers an account view into their storage account specific data.

Following the release of Aggregate Storage Actions, the MDTS team, along with feedback from customers, identified another area of investment—the need for storage customers to “self-service” and view account specific insights without having role-based access to the subscription or storage accounts.

Providing a self-service experience

To enhance the experience of the other personas, MDTS is now focused on the creation of a Microsoft Azure web portal where customers can self-service different storage and transfer capabilities without having to provide any Microsoft Azure Role Based Access (RBAC) to the underlying subscription that hosts the MDTS service.

When designing MDTS self-service capabilities we focused on meeting these primary goals:

  • Make it possible for Microsoft Azure Subscription owners (Storage Admins) to provide the platform and services while not needing to be in the middle of making changes to storage and transfer services.
  • The ability to create custom persona experiences so customers can achieve their storage and transfer goals through a single portal experience in a secure and intuitive way. Some of the new enterprise scale capabilities include:
    • Onboarding.
    • Creating storage shares.
    • Authorization changes.
    • Distributions. Automating the distribution of data from one source to one or multiple destinations.
    • Provide insights into storage actions (based on the data provided in Storage Actions enabled in our second MDTS release).
    • Reporting basic consumption data, like the number of users, groups, and shares on a particular account.
    • Reporting the cost of the account
  • As Azure services and customer scenarios change, the portal can also change.
  • If customers want to “self-host” (essentially take our investments and do it themselves), we will easily be able to accommodate.
Our next design of MDTS introduces a self-service portal.
Our next design of MDTS introduces a self-service portal.

Storage consumer user experiences

After storage is created and configured, data owners can then share steps for storage consumers to start using storage. Upload and download are the most common storage actions, and Microsoft Azure provides software and services needed to perform both actions for manual and automated scenarios.

Microsoft Azure Storage Explorer is recommended for manual scenarios where users can connect and perform high speed uploads and downloads manually. Both Microsoft Azure Data Factory and AzCopy can be used in scenarios where automation is needed. AzCopy is heavily preferred in scenarios where synchronization is required. Microsoft Azure Data Factory doesn’t provide synchronization but does provide robust data copy and data move. Azure Data Factory is also a managed service and better suited in enterprise scenarios where flexible triggering options, uptime, auto scale, monitoring, and metrics are required.

Using Microsoft Azure Storage Explorer for manual storage actions

Developers and Storage Admins are accustomed to using Microsoft Azure Storage Explorer for both storage administration and routine storage actions (e.g., uploading and downloading). Non-storage admin, otherwise known as Storage Consumers, can also use Microsoft Azure Storage Explorer to connect and perform storage actions without needing any role-based access control or access keys to the storage account. Once the storage is authorized, members of authorized groups can perform routine steps to attach the storage they are authorized for, authenticating with their work email, and leveraging the options based on their authorization.

The processes for sign-in and adding a resource via Microsoft Azure Active Directory are found in the Manage Accounts and Open Connect Dialogue options of Microsoft Azure Storage Explorer.

After signing in and selecting the option to add the resource via Microsoft Azure Active Directory, you can supply the storage URL and connect. Once connected, it only requires a few clicks to upload and download data.

 

Microsoft Azure Storage Explorer Local and Attached module.
Microsoft Azure Storage Explorer Local and Attached module. After following the add resource via Microsoft Azure AD process, the Azure AD group itshowcase-engineering is authorized to Read, Write, and Edit (rwe) and members of the group can perform storage actions.

To learn more about using Microsoft Azure Storage Explorer, Get started with Storage Explorer. There are additional links in the More Information section at the end of this document.

Note: Microsoft Azure Storage Explorer uses AzCopy. Having AzCopy as the transport allows storage consumers to benefit from high-speed transfers. If desired, AzCopy can be used as a stand-alone command line application.

Using AzCopy for manual or automated storage actions

AzCopy is a command line interface used to perform storage actions on authorized paths. AzCopy is used in Microsoft Azure Storage Explorer but can also be used as a standalone executable to automate storage actions. It’s a multi-stream TCP based transport capable of optimizing throughput based on the bandwidth available. MDTS customers use AzCopy in scenarios which require synchronization or cases where Microsoft Azure Storage Explorer or Microsoft Azure Data Factory copy activity doesn’t meet the requirements for data transfer. For more information about using AzCopy please see the More Information section at the end of this document.

AzCopy is a great match for standalone and synchronization scenarios. It also has options that are useful when seeking to automate or build applications. Because AzCopy is a single executable running on either a single client or server system, it isn’t always ideal for enterprise scenarios. Microsoft Azure Data Factory is a more robust Microsoft Azure service that meets the most enterprise needs.

Using Microsoft Azure Data Factory for automated copy activity

Some of the teams that use MDTS require the ability to orchestrate and operationalize storage uploads and downloads. Before MDTS, we would have either built a custom service or licensed a third-party solution, which can be expensive and/or time consuming.

Microsoft Azure Data Factory, a cloud-based ETL and data integration service, allows us to create data-driven workflows for orchestrating data movement. Including Azure Data Factory in our storage hosting service model provided customers with a way to automate data copy activities. MDTS’s most common data movement scenarios are distributing builds from a single source to multiple destinations (3-5 destinations are common).

Another requirement for MSTS was to leverage private data stores as a source or destination. Microsoft Azure Data Factory provides the capability to use a private system, also known as a self-hosted integration runtime. When configured this system can be used in copy activity communicating with on-premises file systems. The on-premises file system can then be used as a source and/or destination datastore.

In the situation where on-premises file system data needs to be stored in Microsoft Azure or shared with external partners, Microsoft Azure Data Factory provides the ability to orchestrate pipelines which perform one or multiple copy activities in sequence. These activities result in end-to-end data movement from one on-premises file systems to Microsoft Azure Storage, and then to another private system if desired.

The graphic below provides an example of a pipeline orchestrated to copy builds from a single source to several private destinations.

 

Detailed example of a Microsoft Azure Data Factory pipeline, including the build system source.
Microsoft Azure Data Factory pipeline example. Private site 1 is the build system source. Build system will build, load the source file system, then trigger the Microsoft Azure Data Factory pipeline. Build is then uploaded, then Private sites 2, 3, 4 will download. Function apps are used for sending email notifications to site owners and additional validation.

For more information on Azure Data Factory, please see Introduction to Microsoft Azure Data Factory. There are additional links in the More Information section at the end of this document.

If you are thinking about using Microsoft Azure to develop a modern data transfer and storage solution for your organization, here are some of the best practices we gathered while developing MDTS.

Close the technical gap for storage consumers with a white glove approach to onboarding

Be prepared to spend time with customers who are initially overwhelmed with using Azure Storage Explorer or AzCopy. At Microsoft, storage consumers represent a broad set of disciplines—from engineers and developers to project managers and marketing professionals. Azure Storage Explorer provides an excellent experience for engineers and developers but can be a little challenging for less technical roles.

Have a standard access control model

Use Microsoft Azure Active Directory security groups and group nesting to manage authorization; Microsoft Azure Data Lake Gen2 storage has a limit to the number of Access Controls you can apply. To avoid reaching this limit, and to simplify administration, we recommend using Microsoft Azure Active Directory security groups. We apply the access control to the security group only, and in some cases, we nest other security groups within the access control group. We nest Member Security Groups within Access Control Security Groups to manage access. These group types don’t exist in Microsoft Azure Active Directory but do exist within our MDTS service as a process to differentiate the purpose of a group. We can easily determine this differentiation by the name of the group.

  • Access Control Security Groups: We use this group type for applying Access Control on ADLS Gen2 storage containers and/or folders.
  • Member Security Groups: We use these to satisfy cases where access to containers and/or folders will constantly change for members.

When there are large numbers of members, nesting prevents the need to add members individually to the Access Control Security Groups. When access is no longer needed, we can remove the Member Group(s) from the Access Control Security Group and no further action is needed on storage objects.

Along with using Microsoft Azure Active Directory security groups, make sure to have a documented process for applying access controls. Be consistent and have a way of tracking where access controls are applied.

Use descriptive display names for your Microsoft Azure AD security groups

Because Microsoft Azure AD doesn’t currently organize groups by owners, we recommend using naming conventions that capture the group’s purpose and type to allow for easier searches.

  • Example 1: mdts-ac-storageacct1-rwe. This group name uses our service standard naming convention for Access Control group type on Storage Account 1, with access control Read, Write, and Execute. mdts = Service, ac = Access Control Type, storageacct1 = ADLS Gen2 Storage Account Name, rwe = permission of the access control.
  • Example 2: mdts-mg-storageacct1-project1. This group name uses our service standard naming convention for Member Group type on Storage Account 1. This group does not have an explicit access control on storage, but it is nested in mdts-ac-storageacct1-rwe where any member of this group has the Read, Write, and Execute access to storage account1 because it’s nested in mdts-ac-storageacct1-rwe.

Remember to propagate any changes to access controls

Microsoft Azure Data Lake Gen2 storage, by default, doesn’t automatically propagate any access control changes. As such, when removing, adding, or changing an access control, you need to follow an additional step to propagate the access control list. This option is available in Microsoft Azure Storage Explorer.

Storage Consumers can attempt Administrative options

Storage Consumers use Microsoft Azure Storage Explorer and are authenticated with their Microsoft Azure Active Directory user profile. Since Azure Storage Explorer is primarily developed for Storage Admin and Developer personas, all administrative actions are visible. It is common for storage consumers to attempt administrative actions, like managing access or deleting a container. Those actions will fail due to only being accessed via access control lists (ACLs). There isn’t a way to provide administration actions via ACL’s. If administrative actions are needed, then users will become a Storage Admin which has access via Azure’s Role Based Access Control (RBAC).

Microsoft Azure Storage Explorer and AzCopy are throughput intensive

As stated above, AzCopy is leveraged by Microsoft Azure Storage Explorer for transport actions. When using Azure Storage Explorer or AzCopy it’s important to understand that transfer performance is its specialty. Because of this, some clients and/or networks may benefit from throttling AzCopy’s performance. In circumstances where you don’t want AzCopy to consume too much network bandwidth, there are configurations available. In Microsoft Azure Storage Explorer use the Settings option and select the Transfers section to configure Network Concurrency and/or File Concurrency. In the Network Concurrency section, Adjust Dynamically is a default option. For AzCopy, there are flags and environment variables available to optimize performance.

For more information, visit Configure, optimize, and troubleshoot AzCopy.

Microsoft Azure Storage Explorer sign-in with MSAL

Microsoft Authentication Library, currently in product preview, provides enhanced single sign-on, multi-factor authentication, and conditional access support. In some situations, users won’t authenticate unless MSAL is selected. To enable MSAL, select the Setting option from Microsoft Azure Storage Explorer’s navigation pane. Then in the application section, select the option to enable Microsoft Authentication Library.

B2B invites are needed for external accounts (guest user access)

When there is a Microsoft business need to work with external partners, leveraging guest user access in Microsoft Azure Active Directory is necessary. Once the B2B invite process is followed, external accounts can be authorized by managing group membership. For more information, read What is B2B collaboration in Azure Active Directory?

Key Takeaways

We used Microsoft Azure products and services to create an end-to-end modern data transfer and storage service that can be used by any group at Microsoft that desires cloud data storage. The release of Microsoft Azure Data Lake Gen 2, Microsoft Azure Data Factory, and the improvements in the latest release of Azure Storage Explorer made it possible for us to offer MDTS as a fully native Microsoft Azure service.

One of the many strengths of using Microsoft Azure is the ability to use only what we needed, as we needed it. For MDTS, we started by simply creating storage accounts, requesting Microsoft Azure Active Directory Security Groups, applying an access control to storage URLs, and releasing the storage to customers for use. We then invested in adding storage actions and developed self-service capabilities that make MDTS a true enterprise-scale solution for data transfer and storage in the cloud.

We are actively encouraging the adoption of our MDTS storage design to all Microsoft engineering teams that still rely on legacy storage hosted in the Microsoft Corporate network. We are also encouraging any Microsoft Azure consumers to consider this design when evaluating options for storage and file sharing scenarios. Our design has proven to be scalable, compliant, and performant with the Microsoft Zero Trust security initiative, handling extreme payloads with high throughput and no constraints on the size or number of files.

By eliminating our dependency on third-party software, we have been able to eliminate third-party licensing, consulting, and hosting costs for many on-premises storage systems.

Are you ready to learn more? Sign up for your own Microsoft Azure subscription and get started today.

To receive the latest updates on Azure storage products and features to meet your cloud investment needs, visit Microsoft Azure updates.

Related links

 

Recent