Module Input

Input mappings define how PyTorch tensors are converted into Scallop facts. They specify the domain (set of possible values) of input relations, allowing you to pass probability distributions as tensors and have them automatically converted to probabilistic facts.

What is an Input Mapping?

An input mapping establishes a correspondence between tensor indices and Scallop tuples:

input_mappings={"digit": range(10)}

This mapping says: “The digit relation has domain 0-9, and a tensor of shape (10,) represents probabilities for each digit.”

Example:

# Tensor: [0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.8, 0.1]
# Interpreted as:
# digit(0) with probability 0.1
# digit(8) with probability 0.8
# digit(9) with probability 0.1
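
The index-to-tuple correspondence can be sketched in plain Python, without scallopy or PyTorch. The helper name probs_to_facts is hypothetical and exists only for this illustration; it shows the pairing idea under the assumption that position i in the tensor corresponds to the i-th domain value, and is not scallopy's actual implementation.

```python
def probs_to_facts(probs, domain):
    """Pair each probability with the domain tuple at the same index.

    Hypothetical helper for illustration; not part of scallopy.
    """
    domain = list(domain)
    if len(probs) != len(domain):
        raise ValueError("tensor length must equal the domain size")
    return [(p, (value,)) for p, value in zip(probs, domain)]


probs = [0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.8, 0.1]
facts = probs_to_facts(probs, range(10))

# Print only the facts that carry non-zero probability.
for p, tup in facts:
    if p > 0:
        print(f"digit({tup[0]}) with probability {p}")
```

All ten facts are produced; the loop merely hides the zero-probability ones to match the listing above.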

Input Mapping Formats

Scallop supports multiple formats for defining input mappings:

Format 1: Range (Simple Integer Domain)

The most common format for integer-valued relations:

input_mappings={"digit": range(10)}
# Domain: digit(0), digit(1), ..., digit(9)
# Expected tensor shape: (10,) or (batch_size, 10)

Properties:

kind: "list"
shape: (10,)
dimension: 1
is_singleton: True (single-column relation)

Format 2: List (Explicit Enumeration)

For arbitrary values:

input_mappings={
  "color": ["red", "green", "blue"]
}
# Domain: color(red), color(green), color(blue)
# Expected tensor shape: (3,)

With tuples:

input_mappings={
  "edge": [(0,1), (1,2), (2,3), (0,3)]
}
# Domain: edge(0,1), edge(1,2), edge(2,3), edge(0,3)
# Expected tensor shape: (4,)

Properties:

kind: "list"
shape: (len(list),)
is_singleton: False if tuples, True if values

Format 3: Dictionary (Multi-Dimensional)

For relations with multiple columns, use a dictionary mapping dimension indices to value lists:

input_mappings={
  "edge": {
    0: range(5),  # First column: nodes 0-4
    1: range(5),  # Second column: nodes 0-4
  }
}
# Domain: all pairs (i, j) where i, j ∈ {0, 1, 2, 3, 4}
# Expected tensor shape: (5, 5) - 25 possible edges

Mixed types:

input_mappings={
  "likes": {
    0: ["alice", "bob", "charlie"],
    1: ["pizza", "salad", "burger"],
  }
}
# Domain: likes(person, food) for all person-food combinations
# Expected tensor shape: (3, 3)

Properties:

kind: "dict"
shape: (len(dim0), len(dim1), ...)
dimension: Number of dimensions
is_singleton: False
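
To make the dictionary format concrete, the sketch below derives the expected tensor shape and enumerates the domain with itertools.product. The function names are hypothetical, and the sketch assumes that tensor position [i][j] corresponds to the tuple (mapping[0][i], mapping[1][j]); it illustrates the idea rather than reproducing scallopy's internals.

```python
from itertools import product


def dict_mapping_shape(mapping):
    """Expected tensor shape: one axis per column, in key order."""
    return tuple(len(list(mapping[i])) for i in sorted(mapping))


def dict_mapping_domain(mapping):
    """All tuples in the domain, in row-major (last axis fastest) order."""
    columns = [list(mapping[i]) for i in sorted(mapping)]
    return list(product(*columns))


likes = {
    0: ["alice", "bob", "charlie"],
    1: ["pizza", "salad", "burger"],
}

shape = dict_mapping_shape(likes)    # one entry per column
domain = dict_mapping_domain(likes)  # every person-food combination

print(shape, len(domain))
print(domain[:3])
```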

Format 4: Tuple (Fixed Constant)

For a single fixed tuple:

input_mappings={
  "start_node": (0,)
}
# Domain: start_node(0) only
# Expected tensor shape: ()

Properties:

kind: "tuple"
shape: ()
dimension: 0

Format 5: Value (Single Constant)

For a single value:

input_mappings={
  "threshold": 0.5
}
# Domain: threshold(0.5) only
# Expected tensor shape: ()

Properties:

kind: "value"
shape: ()
dimension: 0
is_singleton: True

Tensor Shapes and Batching

Input mappings automatically handle batching.

Single Example

If the tensor's shape matches the mapping's shape exactly, it is treated as a single example:

im = scallopy.InputMapping(range(10))
tensor = torch.softmax(torch.randn(10), dim=0)  # Single probability distribution

facts = im.process_tensor(tensor, batched=False)
# Returns: list of 10 facts

Batched Input

If the tensor has an extra leading dimension, it is treated as a batch:

im = scallopy.InputMapping(range(10))
tensor = torch.softmax(torch.randn(16, 10), dim=1)  # Batch of 16 distributions

facts = im.process_tensor(tensor, batched=True)
# Returns: list of 16 lists, each with 10 facts

Multi-Dimensional Mappings

For multi-dimensional mappings, the tensor shape must match the mapping shape, optionally preceded by a batch dimension:

im = scallopy.InputMapping({0: range(5), 1: range(3)})
tensor = torch.randn(5, 3)  # Single example
# OR
tensor = torch.randn(16, 5, 3)  # Batch of 16

facts = im.process_tensor(tensor, batched=True)  # Use batched=False for the single-example tensor
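
The shape rule described in this section can be written down as a small check. This is a minimal sketch assuming exactly one optional leading batch dimension; classify_input is a hypothetical helper, not a scallopy function.

```python
def classify_input(tensor_shape, mapping_shape):
    """Return "single" or "batched" according to the shape rule above.

    Hypothetical helper for illustration; not part of scallopy.
    """
    tensor_shape = tuple(tensor_shape)
    mapping_shape = tuple(mapping_shape)
    if tensor_shape == mapping_shape:
        return "single"
    # A batch has exactly one extra leading dimension.
    if (len(tensor_shape) == len(mapping_shape) + 1
            and tensor_shape[1:] == mapping_shape):
        return "batched"
    raise ValueError(
        f"shape {tensor_shape} does not match mapping shape {mapping_shape}"
    )


print(classify_input((10,), (10,)))         # range(10), one distribution
print(classify_input((16, 10), (10,)))      # range(10), batch of 16
print(classify_input((16, 5, 3), (5, 3)))   # {0: range(5), 1: range(3)}
```

Any other shape raises a ValueError in this sketch, which mirrors the requirement that the tensor shape must match the mapping.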

Sparse Inputs

For large domains, you often don’t want to include all facts. Scallop provides filtering mechanisms:

Retain Top-K

Keep only the K highest-probability facts:

input_mappings={
  "digit": scallopy.InputMapping(
    range(10),
    retain_k=3  # Keep only top 3 digits
  )
}

# Tensor: [0.05, 0.02, 0.30, 0.01, 0.40, 0.03, 0.10, 0.02, 0.05, 0.02]
# After retain_k=3: only facts for indices 4 (0.40), 2 (0.30), 6 (0.10) are kept
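
The selection in the comment above can be reproduced with a few lines of plain Python. This is only a sketch of what top-K retention means; retain_top_k is a hypothetical name, and scallopy's own implementation operates on tensors rather than lists.

```python
def retain_top_k(probs, k):
    """Return (index, probability) pairs for the k largest probabilities,
    ordered from most to least likely.

    Hypothetical helper for illustration; not part of scallopy.
    """
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


probs = [0.05, 0.02, 0.30, 0.01, 0.40, 0.03, 0.10, 0.02, 0.05, 0.02]
kept = retain_top_k(probs, k=3)
print(kept)
```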

With multi-dimensional mappings:

input_mappings={
  "edge": scallopy.InputMapping(
    {0: range(10), 1: range(10)},
    retain_k=5,  # Keep only top 5 edges across all 100 possibilities
  )
}

Per-dimension sampling:

input_mappings={
  "edge": scallopy.InputMapping(
    {0: range(10), 1: range(10)},
    retain_k=2,
    sample_dim=1,  # Keep top 2 along dimension 1, i.e. for each value of dimension 0
  )
}
# Result: 10 * 2 = 20 facts (top 2 destinations for each source)
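
The per-dimension case can be sketched on a smaller matrix. The code below assumes that selecting along dimension 1 means picking the best destinations separately for each source row, which is what the "top 2 destinations for each source" result above suggests; the helper name is hypothetical and the sketch is not scallopy's implementation.

```python
def retain_top_k_along_dim1(matrix, k):
    """For each row i, keep the k columns j with the highest probability.

    Returns (i, j, probability) triples. Hypothetical helper for
    illustration; not part of scallopy.
    """
    kept = []
    for i, row in enumerate(matrix):
        best = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        kept.extend((i, j, row[j]) for j in best)
    return kept


# 3 sources x 4 destinations instead of 10 x 10, to keep the example short.
edge_probs = [
    [0.1, 0.7, 0.2, 0.0],
    [0.6, 0.1, 0.0, 0.3],
    [0.0, 0.2, 0.3, 0.5],
]

kept_edges = retain_top_k_along_dim1(edge_probs, k=2)
print(len(kept_edges))  # number of rows times k
for i, j, p in kept_edges:
    print(f"edge({i}, {j}) with probability {p}")
```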

Retain Threshold

Keep only facts above a probability threshold:

input_mappings={
  "digit": scallopy.InputMapping(
    range(10),
    retain_threshold=0.1  # Only keep probabilities > 0.1
  )
}

# Tensor: [0.05, 0.02, 0.30, 0.01, 0.40, 0.03, 0.10, 0.02, 0.05, 0.02]
# After threshold: only facts for indices 2 (0.30), 4 (0.40) are kept
# Note: 0.10 is NOT kept (must be strictly greater than threshold)
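
The strict comparison is easy to see in a plain-Python sketch. The helper name retain_above_threshold is hypothetical; only the strictly-greater-than rule is taken from the text above.

```python
def retain_above_threshold(probs, threshold):
    """Keep (index, probability) pairs whose probability strictly
    exceeds the threshold.

    Hypothetical helper for illustration; not part of scallopy.
    """
    return [(i, p) for i, p in enumerate(probs) if p > threshold]


probs = [0.05, 0.02, 0.30, 0.01, 0.40, 0.03, 0.10, 0.02, 0.05, 0.02]
kept = retain_above_threshold(probs, threshold=0.1)
print(kept)  # index 6 (probability 0.10) is excluded by the strict comparison
```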

Categorical Sampling

Instead of deterministic top-K, sample K facts according to their probabilities:

input_mappings={
  "digit": scallopy.InputMapping(
    range(10),
    retain_k=3,
    sample_strategy="categorical"  # Stochastic sampling
  )
}
# Each forward pass draws 3 digits at random, weighted by their probabilities,
# so the retained facts can differ from one pass to the next
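
The difference from deterministic top-K can be illustrated with Python's random module. This sketch assumes sampling without replacement, so the K retained facts are distinct; whether scallopy samples with or without replacement is not stated here, and sample_k_categorical is a hypothetical helper.

```python
import random


def sample_k_categorical(probs, k, rng):
    """Draw up to k distinct indices, each draw weighted by probability.

    Hypothetical helper for illustration; assumes sampling without
    replacement, which may differ from scallopy's behavior.
    """
    remaining = [i for i, p in enumerate(probs) if p > 0]
    chosen = []
    while remaining and len(chosen) < k:
        weights = [probs[i] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


probs = [0.05, 0.02, 0.30, 0.01, 0.40, 0.03, 0.10, 0.02, 0.05, 0.02]
rng = random.Random(0)  # seeded only to make the sketch repeatable

first_pass = sample_k_categorical(probs, 3, rng)
second_pass = sample_k_categorical(probs, 3, rng)
print(first_pass, second_pass)  # the two passes may retain different digits
```

Unlike top-K, low-probability digits have a chance of being retained, which can help exploration during training at the cost of noisier outputs.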

Disjunctions in Input Mappings

When facts are mutually exclusive, mark them as disjunctive:

Global Disjunction

All facts in the relation are mutually exclusive:

input_mappings={
  "digit": scallopy.InputMapping(
    range(10),
    disjunctive=True
  )
}
# All 10 digit facts share one mutual exclusion ID

Per-Dimension Disjunction

Mutual exclusion along a specific dimension:

input_mappings={
  "color": scallopy.InputMapping(
    {0: range(3), 1: ["red", "green", "blue"]},
    disjunctive_dim=1  # Each object has mutually exclusive colors
  )
}
# color(0, red), color(0, green), color(0, blue) are mutually exclusive
# color(1, red), color(1, green), color(1, blue) are mutually exclusive
# But color(0, red) and color(1, red) are NOT mutually exclusive
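
One way to picture both kinds of disjunction is as a group ID attached to every fact: facts with the same ID are mutually exclusive. The sketch below assigns such IDs from the mapping shape alone. The function name and the ID scheme are assumptions made for illustration and do not describe how scallopy represents mutual exclusion internally.

```python
from itertools import product


def disjunction_ids(shape, disjunctive_dim=None):
    """Map every index tuple to a mutual-exclusion group ID.

    With disjunctive_dim=None all facts share one group (global
    disjunction). Otherwise, facts that differ only along
    disjunctive_dim share a group.

    Hypothetical helper for illustration; not part of scallopy.
    """
    groups = {}
    ids = {}
    for index in product(*(range(n) for n in shape)):
        if disjunctive_dim is None:
            key = ()
        else:
            # Drop the disjunctive axis; what remains identifies the group.
            key = index[:disjunctive_dim] + index[disjunctive_dim + 1:]
        ids[index] = groups.setdefault(key, len(groups))
    return ids


colors = ["red", "green", "blue"]
per_object = disjunction_ids((3, 3), disjunctive_dim=1)
for (obj, c), group in per_object.items():
    print(f"color({obj}, {colors[c]}) -> group {group}")

global_ids = disjunction_ids((10,))  # all ten digit facts in one group
```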

Complete Example

Putting it all together:

import torch
import scallopy

# Create module with complex input mappings
module = scallopy.Module(
  provenance="difftopkproofs",
  k=3,
  program="""
    // Classify objects
    rel class(o, c) = color(o, col), shape(o, sh), classifier(col, sh, c)
  """,
  input_mappings={
    # Multi-dimensional, keeping only the most likely color per object
    "color": scallopy.InputMapping(
      {0: range(10), 1: ["red", "green", "blue"]},
      disjunctive_dim=1,  # Each object has one color
      retain_k=1,  # Keep most likely color per object
      sample_dim=1,
    ),

    # Multi-dimensional with threshold
    "shape": scallopy.InputMapping(
      {0: range(10), 1: ["circle", "square", "triangle"]},
      disjunctive_dim=1,
      retain_threshold=0.2,  # Only confident shapes
    ),

    # Fixed classifier (non-probabilistic)
    "classifier": [
      ("red", "circle", "apple"),
      ("green", "circle", "lime"),
      # ... more rules
    ],
  },
  output_mapping=("class", [(i, c) for i in range(10) for c in ["apple", "lime"]])
)

# Use with batched tensors
color_probs = torch.softmax(torch.randn(16, 10, 3), dim=2)
shape_probs = torch.softmax(torch.randn(16, 10, 3), dim=2)

result = module(color=color_probs, shape=shape_probs)
# Result shape: (16, 20) - batch of 16, one column per (object, class) pair (10 objects x 2 classes)

Summary

Input mappings define the domain of input relations
Five formats: range, list, dict, tuple, value
Batching is automatic - extra leading dimension = batch
Sparse inputs via retain_k, retain_threshold, or sample_strategy
Disjunctions mark mutually exclusive facts (global or per-dimension)
Properties: kind, shape, dimension, is_singleton

For more details:

Creating Modules - Overview of Scallop modules
Module Output - Output mappings
Configuring Provenance - Probability tracking
