In recent years, there has been tremendous progress in video understanding and action recognition. Current algorithms can reliably recognize the action a subject is performing in many unconstrained settings from third-person viewpoints. Egocentric action recognition, however, trails behind this progress, even though it has many applications in augmented reality, robotics and surveillance. Analyzing hands in action is a challenging computer vision problem due to the mutual occlusions between the hands and the manipulated objects. The problem is even harder from first-person viewpoints because of the difficulties specific to egocentric vision, such as fast camera motion, large occlusions, background clutter and a lack of datasets. Addressing these challenges requires a unified understanding of the positions and movements of the hands and the objects they manipulate. In this project, we explore efficient algorithms and methods for recognizing egocentric actions and hand-object interactions.