Securely Training Decision Trees Efficiently

31st Annual Conference on Computer and Communications Security (ACM CCS 2024)


Decision trees are an important class of supervised learning algorithms. When multiple entities contribute data to train a decision tree (e.g., for fraud detection in the financial sector), data privacy concerns necessitate the use of a privacy-enhancing technology such as secure multi-party computation (MPC) to protect the underlying training data. The prior state of the art (Hamada et al.) constructs an MPC protocol for decision tree training with communication complexity O(hmN log N) when building a decision tree of height h over a training dataset of N samples, each having m attributes.

In this work, we significantly reduce the communication complexity of secure decision tree training. We construct a protocol with communication complexity O(mN log N + hmN + hN log N), thereby achieving an improvement of a factor of min(h, m, log N) over Hamada et al. At the core of our technique is an improved protocol that regroups sorted private elements into additional, finer groups (according to a flag vector) while maintaining their relative ordering. We implement our protocol in the MP-SPDZ framework and show that it requires 10× less communication and is 9× faster than the protocol of Hamada et al.
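To make the core regrouping operation concrete, the following is a minimal plaintext sketch of the functionality it computes (the actual contribution is a protocol that evaluates this obliviously under MPC on secret-shared data; the function name and label encoding here are illustrative, not from the paper). Within each existing group, elements are split by a binary flag into two finer subgroups, and the relative order of elements is preserved — a stable partition applied per group:

```python
def regroup(values, groups, flags):
    """Plaintext analogue of the secure regrouping step.

    values: list of elements, sorted within each group
    groups: group label for each element (same length as values)
    flags:  0/1 flag deciding each element's new subgroup

    Within each group, flag-0 elements are placed before flag-1
    elements, and the relative order inside each subgroup is kept.
    Returns the regrouped values and (group, flag) subgroup labels.
    """
    out_vals, out_groups = [], []
    # dict.fromkeys preserves first-occurrence order of group labels
    for g in dict.fromkeys(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        zeros = [values[i] for i in idx if flags[i] == 0]
        ones = [values[i] for i in idx if flags[i] == 1]
        out_vals += zeros + ones
        out_groups += [(g, 0)] * len(zeros) + [(g, 1)] * len(ones)
    return out_vals, out_groups
```

For example, `regroup([3, 5, 8, 1, 4], [0, 0, 0, 1, 1], [1, 0, 1, 0, 1])` splits group 0 into subgroups `[5]` and `[3, 8]` and group 1 into `[1]` and `[4]`, keeping relative order throughout. In decision tree training, such a step corresponds to routing each node's (already sorted) samples to its two children according to the chosen split, which is why performing it cheaply dominates the communication cost.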