Crystal Structure Analysis (CSA) Documentation

Welcome to the Crystal Structure Analysis (CSA) documentation. CSA is a comprehensive Python framework for extracting, processing, and analyzing molecular crystal structures from the Cambridge Structural Database (CSD).

🚀 Key Features

  • High-Performance Pipeline: GPU-accelerated batch processing with PyTorch

  • CSD Integration: Direct interface to the Cambridge Structural Database

  • Advanced Analytics: Fragment analysis, intermolecular contacts, and geometric descriptors

  • Efficient Storage: HDF5-based data management with variable-length datasets

  • Scalable Architecture: Parallel processing for large datasets
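
Crystal structures differ in atom count, so per-atom data is ragged, which is the problem variable-length storage addresses. The sketch below is not CSA's actual HDF5 layout; it only illustrates the idea in plain Python by packing ragged rows into one flat list plus an offsets index. The function names and the toy atomic numbers are hypothetical.

```python
from itertools import accumulate


def pack_ragged(rows):
    """Flatten variable-length rows into (flat_values, offsets).

    Row i occupies flat_values[offsets[i]:offsets[i + 1]].
    """
    lengths = [len(row) for row in rows]
    offsets = [0, *accumulate(lengths)]
    flat_values = [value for row in rows for value in row]
    return flat_values, offsets


def unpack_row(flat_values, offsets, index):
    """Recover one variable-length row from the packed representation."""
    return flat_values[offsets[index]:offsets[index + 1]]


# Hypothetical atomic numbers for three structures with 3, 5, and 2 atoms.
atomic_numbers = [[6, 1, 1], [6, 6, 8, 1, 1], [7, 1]]
flat, offsets = pack_ragged(atomic_numbers)
second_structure = unpack_row(flat, offsets, 1)
```

Only the flat values and the offsets need to be stored, so no padding is wasted on small structures, and any single structure can be read back without touching the others.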

📖 Quick Navigation

🏃 Getting Started

New to CSA? Start here for installation, configuration, and your first analysis.

📚 User Guide

Learn the core concepts and workflow of the CSA pipeline.

🎯 Tutorials

Step-by-step tutorials for common analysis scenarios.

🔧 API Reference

Complete API documentation for all modules and classes.

💡 Examples

Ready-to-run code examples for various use cases.

⚙️ Technical Details

Deep dive into algorithms, architecture, and performance.

🔬 What CSA Does

CSA transforms raw crystallographic data into rich, analysis-ready datasets through a five-stage pipeline:

  1. Family Extraction - Query and organize CSD structures by chemical families

  2. Similarity Clustering - Group structures by 3D packing similarity

  3. Representative Selection - Choose optimal structures using statistical metrics

  4. Data Extraction - Extract atomic coordinates, bonds, and intermolecular contacts

  5. Feature Engineering - Compute advanced geometric and topological descriptors
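
The first three stages can be sketched on toy data to show how they chain together. This is an illustrative outline, not CSA's implementation: it assumes a family is the six-letter refcode prefix, replaces 3D packing similarity with a made-up scalar descriptor, and uses "closest to the cluster mean" as a stand-in selection metric. All function names and records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (refcode, scalar packing descriptor).
records = [
    ("ABCDEF", 1.00), ("ABCDEF01", 1.02), ("ABCDEF02", 1.03),
    ("ABCDEF03", 1.90), ("GHIJKL", 0.50),
]


def extract_families(records):
    """Stage 1: group entries by family (assumed: six-letter refcode prefix)."""
    families = defaultdict(list)
    for refcode, descriptor in records:
        families[refcode[:6]].append((refcode, descriptor))
    return dict(families)


def cluster_by_similarity(members, tolerance=0.1):
    """Stage 2 stand-in: chain sorted members whose neighbours differ by <= tolerance."""
    clusters = []
    for item in sorted(members, key=lambda m: m[1]):
        if clusters and item[1] - clusters[-1][-1][1] <= tolerance:
            clusters[-1].append(item)
        else:
            clusters.append([item])
    return clusters


def select_representative(cluster):
    """Stage 3 stand-in: pick the member closest to the cluster mean."""
    centre = mean(descriptor for _, descriptor in cluster)
    return min(cluster, key=lambda m: abs(m[1] - centre))[0]


def run_pipeline(records):
    """Run stages 1-3; stages 4-5 would consume the returned refcodes."""
    representatives = []
    for members in extract_families(records).values():
        for cluster in cluster_by_similarity(members):
            representatives.append(select_representative(cluster))
    return representatives


representatives = run_pipeline(records)
```

Each stage narrows the data before the next one runs, so the expensive steps (data extraction and feature engineering) only see one representative per cluster rather than every entry in a family.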

Note

CSA requires a valid Cambridge Crystallographic Data Centre (CCDC) license for full functionality.

🤝 Community & Support

  • Issues: Report bugs and request features on GitHub

  • Discussions: Join the community forum for questions and ideas

  • Contributing: Read our contribution guidelines to get involved

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.