Securely Training Decision Trees Efficiently

31st Annual Conference on Computer and Communications Security (ACM CCS 2024) |

PDF

Decision trees are an important class of supervised learning algorithms. When multiple entities contribute data to train a decision tree (e.g. for fraud detection in the financial sector), data privacy concerns necessitate the use of a privacy-enhancing technology such as secure multi-party computation (MPC) in order to secure the underlying training data. Prior state-of-the-art (Hamada et al.) construct an MPC protocol for decision tree training with a communication of O (β„Žπ‘šπ‘ log 𝑁 ), when building a decision tree of height β„Ž for a training dataset of 𝑁 samples, each having π‘š attributes.Β 

In this work, we significantly reduce the communication complexity of secure decision tree training. We construct a protocol with communication complexity O (π‘šπ‘ log 𝑁 + β„Žπ‘šπ‘ + β„Žπ‘ log 𝑁 ), thereby achieving an improvement of β‰ˆ min(β„Ž, π‘š, log 𝑁 ) over Hamada et al. At the core of our technique is an improved protocol to regroup sorted private elements further into additional groups (according to a flag vector) while maintaining their relative ordering. We implement our protocol in the MP-SPDZ frameworkΒ and show that it requires 10Γ— lesser communication and is 9Γ— faster than Hamada et al.