DocumentationData Organization
Data Organization
Learn how to organize data and content workflows in Pynions.
Pynions uses a structured approach to organize data and content workflows.
Directory Structure
data/
├── output/ # All workflow outputs
│ └── [project]/ # Project-specific folders
│ ├── assets/ # Project-specific assets
│ └── [status]_[project]_[date].[ext]
└── raw/ # Original, unmodified data
├── scraped_data/ # Raw scraped content
└── logs/ # Application logs
Data Folders
data
: Processed data filesassets
: Related assets and resources
Workflow Status Types
Content goes through six (optional) stages in a typical workflow:
1_research
: Initial research and data gathering2_brief
: Content brief or outline3_outline
: Detailed content structure4_draft
: First version of content5_review
: Content under review6_final
: Final approved version
File Naming Convention
Files are automatically named using the following pattern:
[status]_[project]_[YYYY_MM_DD].[extension]
Examples:
1_research_best_mailchimp_alternatives_2024_ 03_09.md
2_brief_best_mailchimp_alternatives_2024_03_09.md
4_draft_best_mailchimp_alternatives_2024_03_09.md
Usage
Save content at different stages of your workflow:
from pynions.core.utils import save_result
# Save research content
save_result(
content="Research findings...",
project_name="best-mailchimp-alternatives",
status="research"
)
# Save draft content
save_result(
content="Draft content...",
project_name="best-mailchimp-alternatives",
status="draft"
)
# Save related data
save_result(
content='{"data": "metrics"}',
project_name="best-mailchimp-alternatives",
status="data",
extension="json"
)
Raw Data Storage
For storing raw data from various sources:
from pynions.core.utils import save_raw_data
# Save scraped content
save_raw_data(
content="Raw scraped content...",
source="serper",
data_type="scraped_data"
)
# Save log data
save_raw_data(
content="Log entry...",
source="workflow",
data_type="logs"
)
Configuration
Status types and their properties are configured in settings.json
:
{
"workflow": {
"status_types": {
"research": {
"description": "Initial research and data gathering",
"extensions": ["md", "txt"]
},
"brief": {
"description": "Content brief or outline",
"extensions": ["md"]
},
"draft": {
"description": "First version of content",
"extensions": ["md"]
}
// ... other status types
}
}
}
Best Practices
-
Project Names
- Use descriptive, hyphen-separated names
- Keep names consistent across related content
- Example: "best-mailchimp-alternatives"
-
Content Organization
- Create a new project folder for each content initiative
- Keep all related files within the project folder
- Use appropriate status types to track progress
-
Raw Data
- Always save original, unmodified data in the raw directory
- Use descriptive source names
- Include timestamps for tracking
-
File Extensions
- Use
.md
for content files (research, briefs, drafts) - Use
.json
for structured data - Use
.txt
for plain text and logs
- Use
Data Lifecycle
-
Creation
- Raw data is saved in appropriate raw/ subdirectories
- New project folders are created as needed
-
Processing
- Content moves through various status types
- Each stage saved with appropriate status
-
Completion
- Final content marked with 'final' status
- Raw data retained for reference
-
Maintenance
- Regular cleanup of old raw data
- Archive completed projects as needed
Updated 4 months ago
Edit this page