[Enhance]add JobAPI doc for OLAP algorithms in HugeServer #205

Open
JackyYangPassion opened this issue Apr 14, 2023 · 5 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@JackyYangPassion
Contributor

Problem Type

None

Your Question

To help users quickly apply the integrated OLAP algorithms, I suggest adding the job API to the RESTful API module of the documentation, together with usage instructions.

  • AlgorithmAPI

Environment
Server Version: 1.0.0 (Apache Release Version)
Backend: MySQL 8.0.32
hugegraph-hubble
hugegraph-loader

For example, to run an LPA algorithm job:

  1. Load demo data into the hugegraph with hugegraph-loader

     bin/hugegraph-loader.sh -g hugegraph -f example/file/struct.json -s example/file/schema.groovy
    

    image

  2. Send a POST request to http://localhost:8080/graphs/hugegraph/jobs/algorithm/lpa with Postman

    image

  3. View the result of the algorithm execution in hugegraph-hubble
    Task status:
    image

    LPA algorithm summary:
    image

    Graph node label detail:
    image
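
Step 2 above could also be scripted instead of using Postman. The following is only a minimal Python sketch under stated assumptions: it builds the POST request for the URL shown in step 2, but the JSON body is a placeholder, because the parameters used in the screenshot are not reproduced in this issue. `build_lpa_request` and its `params` argument are hypothetical names introduced for illustration.

```python
import json
import urllib.request

# URL taken from step 2 of this issue.
LPA_JOB_URL = "http://localhost:8080/graphs/hugegraph/jobs/algorithm/lpa"


def build_lpa_request(params, url=LPA_JOB_URL):
    """Build (but do not send) a JSON POST request for the LPA job endpoint.

    `params` is a dict of algorithm parameters. Its keys are NOT specified
    in this issue, so they must be filled in from the server documentation.
    """
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder body: replace {} with the real algorithm parameters.
    request = build_lpa_request({})
    print(request.get_method(), request.full_url)
    # To actually submit the job against a running server:
    # with urllib.request.urlopen(request) as response:
    #     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it possible to inspect the URL and body before anything reaches the server.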

@JackyYangPassion
Copy link
Contributor Author

TODO: add AlgorithmAPI to the docs

@imbajin imbajin added enhancement New feature or request help wanted Extra attention is needed good first issue Good for newcomers and removed inactive labels May 17, 2023
@imbajin imbajin moved this to 📋 Backlog in HugeGraph Tasks May 17, 2023
@imbajin imbajin moved this from 📋 Backlog to 🆕 New in HugeGraph Tasks May 29, 2023
@imbajin imbajin removed the good first issue Good for newcomers label Apr 15, 2024
@moonbaseDelta

Hi there,

I got through this to run some computations, but I still can't figure out how the results are managed: for a few algorithms they are stored in a vertex property, for others they are just calculated.

Take degree_centrality, for instance: the results are not saved into the graph, but they are still available through the API. That is useful when working with small subsets of around 10-1000 vertices/edges, but on a fairly large setup, like tens of billions, I definitely want to tune that storing procedure.

@imbajin
Member

imbajin commented Jul 22, 2024

@moonbaseDelta Hi, we support writing results back to the graph database or to a local file/HDFS. You're right, writing back is a common requirement (you could also talk to 🤖 dosu in a comment, like ↓)

Also, @dosu could you help add more context and information for users of graph-computer?


dosubot bot commented Jul 22, 2024

To manage and store the results of graph computations in HugeGraph, especially for large datasets, you can configure the system to write back results to the graph database or a file using the following options:

Writing Results Back to HugeGraph

You can configure HugeGraph to write computation results back to the graph database. Here are the relevant configuration options:

  • Configuration Options:

    • hugegraph.name: Set the graph name to load data and write results back.
    • hugegraph.url: Set the HugeGraph URL to load data and write results back.
    • output.result_write_type: Set the result write-type to output to HugeGraph. Allowed values are: OLAP_COMMON, OLAP_SECONDARY, OLAP_RANGE.
  • Example Configuration:

    hugegraph.name: hugegraph
    hugegraph.url: http://127.0.0.1:8080
    output.result_write_type: OLAP_COMMON
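
As an illustration only, the three write-back options could be assembled and checked in a small Python helper before a job is submitted. This sketch assumes the option names and allowed values exactly as listed in this comment; `writeback_options` and `render` are hypothetical helpers, not part of HugeGraph.

```python
# Allowed values as listed in this comment (not verified against the code).
ALLOWED_WRITE_TYPES = {"OLAP_COMMON", "OLAP_SECONDARY", "OLAP_RANGE"}


def writeback_options(graph_name, graph_url, write_type="OLAP_COMMON"):
    """Return the write-back options as a dict, rejecting unknown write types."""
    if write_type not in ALLOWED_WRITE_TYPES:
        raise ValueError(f"unsupported output.result_write_type: {write_type!r}")
    return {
        "hugegraph.name": graph_name,
        "hugegraph.url": graph_url,
        "output.result_write_type": write_type,
    }


def render(options):
    """Render options in the 'key: value' form used by the example above."""
    return "\n".join(f"{key}: {value}" for key, value in options.items())


if __name__ == "__main__":
    print(render(writeback_options("hugegraph", "http://127.0.0.1:8080")))
```

Rejecting an unknown write type up front avoids discovering the typo only after a long-running job has finished.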

Writing Results to a File (HDFS)

Alternatively, you can write the results to a file, such as HDFS. Here are the relevant configuration options:

  • Configuration Options:

    • output.hdfs_path_prefix: Set the directory of HDFS output result.
    • output.hdfs_url: Set the HDFS URL of output.
    • output.hdfs_user: Set the HDFS user of output.
    • output.hdfs_delimiter: Set the delimiter of HDFS output.
    • output.hdfs_merge_partitions: Whether to merge output files of multiple partitions.
    • output.hdfs_replication: Set the replication number of HDFS.
  • Example Configuration:

    output.hdfs_path_prefix: /hugegraph-computer/results
    output.hdfs_url: hdfs://127.0.0.1:9000
    output.hdfs_user: hadoop
    output.hdfs_delimiter: ,
    output.hdfs_merge_partitions: true
    output.hdfs_replication: 3
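
The HDFS options mix strings, a boolean and an integer, so a typed container can help keep them consistent. The following is a rough Python sketch, assuming the option names listed in this comment; the `HdfsOutput` class is hypothetical, and its default values simply mirror the example above rather than any documented defaults.

```python
from dataclasses import dataclass


@dataclass
class HdfsOutput:
    """Typed view of the HDFS output options named in this comment."""

    path_prefix: str
    url: str
    user: str
    delimiter: str = ","            # mirrors the example, not a documented default
    merge_partitions: bool = True   # mirrors the example, not a documented default
    replication: int = 3            # mirrors the example, not a documented default

    def to_options(self):
        """Convert to string-valued options, with a basic sanity check."""
        if self.replication < 1:
            raise ValueError("replication must be at least 1")
        return {
            "output.hdfs_path_prefix": self.path_prefix,
            "output.hdfs_url": self.url,
            "output.hdfs_user": self.user,
            "output.hdfs_delimiter": self.delimiter,
            # Booleans are rendered lowercase, as in the example above.
            "output.hdfs_merge_partitions": str(self.merge_partitions).lower(),
            "output.hdfs_replication": str(self.replication),
        }


if __name__ == "__main__":
    hdfs = HdfsOutput("/hugegraph-computer/results", "hdfs://127.0.0.1:9000", "hadoop")
    for key, value in hdfs.to_options().items():
        print(f"{key}: {value}")
```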

Best Practices

To ensure efficient handling of large datasets, consider the following best practices:

  • Batch Size and Threads:

    • output.batch_size: Set the batch size of output.
    • output.batch_threads: Set the number of threads used to batch output.
  • Retry Mechanism:

    • output.retry_interval: Set the retry interval when output fails.
    • output.retry_times: Set the retry times when output fails.
  • Thread Pool Management:

    • output.single_threads: Set the number of threads used for single output.
    • output.thread_pool_shutdown_timeout: Set the timeout seconds of output threads pool shutdown.
  • Including Properties:

    • output.with_adjacent_edges: Whether to output the adjacent edges of the vertex.
    • output.with_edge_properties: Whether to output the properties of the edge.
    • output.with_vertex_properties: Whether to output the properties of the vertex.
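
With this many `output.*` keys, a misspelled name is easy to miss. As a hedged sketch, a configuration could be checked against the option names listed in this comment before it is used. The key set below is copied from this comment only and may not match the full set supported by a given release; `unknown_output_keys` is a hypothetical helper.

```python
# Option names copied from this comment; the real set may differ by version.
KNOWN_OUTPUT_KEYS = frozenset({
    "output.result_write_type",
    "output.hdfs_path_prefix",
    "output.hdfs_url",
    "output.hdfs_user",
    "output.hdfs_delimiter",
    "output.hdfs_merge_partitions",
    "output.hdfs_replication",
    "output.batch_size",
    "output.batch_threads",
    "output.retry_interval",
    "output.retry_times",
    "output.single_threads",
    "output.thread_pool_shutdown_timeout",
    "output.with_adjacent_edges",
    "output.with_edge_properties",
    "output.with_vertex_properties",
})


def unknown_output_keys(options):
    """Return the sorted 'output.*' keys that are not in the known set."""
    return sorted(
        key for key in options
        if key.startswith("output.") and key not in KNOWN_OUTPUT_KEYS
    )


if __name__ == "__main__":
    # The values here are arbitrary placeholders, not recommendations.
    candidate = {"output.batch_size": "500", "output.batchsize": "500"}
    print(unknown_output_keys(candidate))
```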

By configuring these options, you can effectively manage and store the results of graph computations in HugeGraph, ensuring efficient handling of large datasets. Additionally, you can refer to the detailed configuration options provided in the HugeGraph documentation for more advanced settings and optimizations [1][2][3].
