[Enhance]add JobAPI doc for OLAP algorithms in HugeServer #205

JackyYangPassion · 2023-04-14T09:42:34Z

Problem Type

None

Your Question

To help users quickly apply the integrated OLAP algorithms, I suggest adding a Job API section to the RESTful API module of the documentation, with usage instructions.

AlgorithmAPI

Environment
Server Version: 1.0.0 (Apache Release Version)
Backend: MySQL 8.0.32
hugegraph-hubble
hugegraph-loader

For example, to run an LPA algorithm job:

Load the demo data into the hugegraph graph with hugegraph-loader:

 bin/hugegraph-loader.sh -g hugegraph -f example/file/struct.json -s example/file/schema.groovy

Send a POST request to http://localhost:8080/graphs/hugegraph/jobs/algorithm/lpa with Postman.
View the result of the algorithm execution in hugegraph-hubble.
Task status:

LPA algorithm summary:

Graph node label detail:
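For readers without Postman, the POST step above can be sketched in Python. This is a minimal sketch, not the documented API: the endpoint URL is the one quoted above, but the JSON body fields (`times`, `precision`) are assumptions for illustration, and the parameters your server's LPA implementation accepts should be checked before use. The request is only built here; sending it needs a running HugeServer.

```python
import json
import urllib.request


def build_lpa_job_request(base_url, graph, params):
    """Build (but do not send) the POST request that submits an LPA job."""
    url = f"{base_url}/graphs/{graph}/jobs/algorithm/lpa"
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The body fields below are assumed for illustration only.
request = build_lpa_job_request(
    "http://localhost:8080", "hugegraph", {"times": 20, "precision": 0.0001}
)
print(request.get_method(), request.full_url)

# Sending requires a running HugeServer, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```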

JackyYangPassion · 2023-04-14T09:43:11Z

TODO: add AlgorithmAPI to DOC

moonbaseDelta · 2024-07-21T07:57:26Z

Hi there,

I got through this and ran some computations, but I still can't figure out how the results are managed: a few algorithms store them in a vertex property, while others just return the calculations.

Take degree_centrality, for instance: the results are not saved into the graph, but they are still available through the API. That's useful when working with small subsets of around 10-1000 vertices/edges, but on a fairly large setup, like tens of billions, I definitely want to tune how the results are stored.

imbajin · 2024-07-22T12:19:51Z

@moonbaseDelta Hi, we support writing results back to the graph database or to a local file/HDFS. You're right, writing back is a common requirement (you could also talk to 🤖 dosu in a comment, as below).

Also, @dosu could you help add more context and information for users of graph-computer?

dosubot · 2024-07-22T12:26:31Z

To manage and store the results of graph computations in HugeGraph, especially for large datasets, you can configure the system to write back results to the graph database or a file using the following options:

Writing Results Back to HugeGraph

You can configure HugeGraph to write computation results back to the graph database. Here are the relevant configuration options:

Configuration Options:
- hugegraph.name: Set the graph name to load data and write results back.
- hugegraph.url: Set the HugeGraph URL to load data and write results back.
- output.result_write_type: Set the result write-type to output to HugeGraph. Allowed values are: OLAP_COMMON, OLAP_SECONDARY, OLAP_RANGE.

Example Configuration:

hugegraph.name: hugegraph
hugegraph.url: http://127.0.0.1:8080
output.result_write_type: OLAP_COMMON
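As a rough illustration of the options above, the following Python sketch assembles the three write-back settings and rejects a write type outside the allowed values listed in this comment. The helper names are hypothetical, and the `key: value` rendering simply mirrors the layout used in this thread; the actual configuration file format of your deployment may differ.

```python
# Allowed values as listed in the comment above.
ALLOWED_WRITE_TYPES = ("OLAP_COMMON", "OLAP_SECONDARY", "OLAP_RANGE")


def hugegraph_output_options(name, url, write_type="OLAP_COMMON"):
    """Hypothetical helper: collect the write-back options into a dict."""
    if write_type not in ALLOWED_WRITE_TYPES:
        raise ValueError(
            f"output.result_write_type must be one of {ALLOWED_WRITE_TYPES}"
        )
    return {
        "hugegraph.name": name,
        "hugegraph.url": url,
        "output.result_write_type": write_type,
    }


def render_options(options):
    """Render options one per line, in the layout used in this thread."""
    return "\n".join(f"{key}: {value}" for key, value in options.items())


writeback_text = render_options(
    hugegraph_output_options("hugegraph", "http://127.0.0.1:8080")
)
print(writeback_text)
```

Validating the write type before submitting a job is only a convenience: it surfaces a typo locally instead of at job start.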

Writing Results to a File (HDFS)

Alternatively, you can write the results to a file, such as HDFS. Here are the relevant configuration options:

Configuration Options:
- output.hdfs_path_prefix: Set the directory of HDFS output result.
- output.hdfs_url: Set the HDFS URL of output.
- output.hdfs_user: Set the HDFS user of output.
- output.hdfs_delimiter: Set the delimiter of HDFS output.
- output.hdfs_merge_partitions: Whether to merge output files of multiple partitions.
- output.hdfs_replication: Set the replication number of HDFS.

Example Configuration:

output.hdfs_path_prefix: /hugegraph-computer/results
output.hdfs_url: hdfs://127.0.0.1:9000
output.hdfs_user: hadoop
output.hdfs_delimiter: ,
output.hdfs_merge_partitions: true
output.hdfs_replication: 3
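The HDFS options can be collected the same way. This is a minimal sketch with hypothetical helper names; the default values are just the ones from the example configuration above, and the replication check is a sanity check of this sketch, not a documented rule.

```python
def hdfs_output_options(path_prefix, url, user, delimiter=",",
                        merge_partitions=True, replication=3):
    """Hypothetical helper: collect the HDFS output options into a dict."""
    if replication < 1:
        raise ValueError("output.hdfs_replication should be at least 1")
    return {
        "output.hdfs_path_prefix": path_prefix,
        "output.hdfs_url": url,
        "output.hdfs_user": user,
        "output.hdfs_delimiter": delimiter,
        "output.hdfs_merge_partitions": merge_partitions,
        "output.hdfs_replication": replication,
    }


def render_options(options):
    """Render options one per line; booleans are written as true/false."""
    lines = []
    for key, value in options.items():
        if isinstance(value, bool):
            value = str(value).lower()
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


hdfs_text = render_options(
    hdfs_output_options("/hugegraph-computer/results",
                        "hdfs://127.0.0.1:9000", "hadoop")
)
print(hdfs_text)
```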

Best Practices

To ensure efficient handling of large datasets, consider the following best practices:

Batch Size and Threads:
- output.batch_size: Set the batch size of output.
- output.batch_threads: Set the number of threads used to batch output.

Retry Mechanism:
- output.retry_interval: Set the retry interval when output fails.
- output.retry_times: Set the retry times when output fails.

Thread Pool Management:
- output.single_threads: Set the number of threads used for single output.
- output.thread_pool_shutdown_timeout: Set the timeout seconds of output threads pool shutdown.

Including Properties:
- output.with_adjacent_edges: Whether to output the adjacent edges of the vertex.
- output.with_edge_properties: Whether to output the properties of the edge.
- output.with_vertex_properties: Whether to output the properties of the vertex.
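To make the batch and retry options more concrete, here is a generic Python sketch of what a batch size, a retry count, and a retry interval usually control. It is not HugeGraph-Computer's implementation: the function, the writer callback, and the assumption that the interval is in seconds are all hypothetical and only illustrate the general pattern.

```python
import time


def write_in_batches(records, write_batch, batch_size, retry_times, retry_interval):
    """Write records in fixed-size batches, retrying each failed batch."""
    written = 0
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        for attempt in range(retry_times + 1):
            try:
                write_batch(batch)
                break
            except OSError:
                if attempt == retry_times:
                    raise  # retries exhausted, give up on this batch
                time.sleep(retry_interval)  # interval assumed to be seconds
        written += len(batch)
    return written


# Demo with a writer that fails once, then succeeds.
calls = {"count": 0}
stored = []


def flaky_writer(batch):
    calls["count"] += 1
    if calls["count"] == 1:
        raise OSError("simulated transient failure")
    stored.extend(batch)


total = write_in_batches(list(range(5)), flaky_writer,
                         batch_size=2, retry_times=3, retry_interval=0.0)
print(total, calls["count"])
```

A larger batch size means fewer write calls but more work lost and repeated when one batch fails, which is the usual trade-off when tuning these values for very large graphs.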

By configuring these options, you can effectively manage and store the results of graph computations in HugeGraph, ensuring efficient handling of large datasets. Additionally, you can refer to the detailed configuration options provided in the HugeGraph documentation for more advanced settings and optimizations [1][2][3].

JackyYangPassion mentioned this issue Apr 14, 2023

Suggestions for improving graph computing (HugeGraph-Computer) apache/incubator-hugegraph-computer#234

Open

github-actions bot added the inactive label Apr 29, 2023

imbajin added the enhancement, help wanted, and good first issue labels and removed the inactive label May 17, 2023

imbajin added this to HugeGraph Tasks May 17, 2023

imbajin moved this to 📋 Backlog in HugeGraph Tasks May 17, 2023

imbajin moved this from 📋 Backlog to 🆕 New in HugeGraph Tasks May 29, 2023

imbajin removed the good first issue label Apr 15, 2024
