5 Replies Latest reply on Jul 27, 2010 8:05 AM by 601256

    Question about importing local database

    584150 New User

      Please let me know about the following two questions.

       

      Q1: The number of columns in Excel seems to be limited,

            since I could not open an Excel file of my database.

            So it is necessary to reduce the number of columns in the database.

            Please let me know how many columns are permitted.

            And if there are other rules or limitations for databases,

            please give more information.

       

      Q2: How can I easily import a private database that has several endpoints?

           If a database has several endpoints, should we import them one endpoint at a time?

           I would like to import data from a private database on an in-house computer;

           the database has many endpoints for each compound.

       

      I know that good guidance, "Guidance on importing databases", is available on the website,

      but I need further information.

       

      Best regards

        • Re: Question about importing local database
          New User

          Hi,

          First, regarding Excel: unless you are using the very latest version of Excel, you will be limited to 256 columns. However, the Toolbox only requires a few fields to import new data.

          Your spreadsheet should have columns for:

          CAS number, SMILES, Data, and SIDS. Optionally, you can have any number of other columns, as indicated in the menu window during the import process. The SIDS column indicates the path for the endpoint. One way of finding an existing endpoint's SIDS path is to gather data for a compound, expand the endpoint tree, and right-click on the name of the endpoint of choice (e.g. Ecotoxicological information/Aquatic Toxicity/Animalia/Crustaceans/Daphnia Magna/LC50/48h); a drop-down menu will appear (see attachment). Select "Copy SIDS path", which lets you paste the path into the SIDS column of your own data. Each row of your data should carry its SIDS path, so you can use several different endpoints within the same spreadsheet by ensuring that the correct SIDS path goes in the appropriate rows.
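
          As a rough illustration of the layout described above, here is a minimal Python sketch that writes a sheet with the four columns and two endpoints mixed in one file. The column headers, the CSV format, the function name build_sheet and the second SIDS path are assumptions made for illustration (the import wizard is where columns are actually mapped), and the Data values are dummy numbers, not measurements.

```python
import csv
import io

# Header names are assumptions; map them to Toolbox fields in the import wizard.
COLUMNS = ["CAS", "SMILES", "Data", "SIDS"]
MAX_XLS_COLUMNS = 256  # column limit mentioned above for older Excel versions

# SIDS path quoted in the post above.
DAPHNIA_LC50_48H = (
    "Ecotoxicological information/Aquatic Toxicity/Animalia/"
    "Crustaceans/Daphnia Magna/LC50/48h"
)
# Hypothetical placeholder: replace with a path obtained via "Copy SIDS path".
SECOND_ENDPOINT = "<paste second SIDS path here>"

# Dummy Data values, for layout only.
rows = [
    {"CAS": "71-43-2", "SMILES": "c1ccccc1", "Data": 1.0, "SIDS": DAPHNIA_LC50_48H},
    {"CAS": "64-17-5", "SMILES": "CCO", "Data": 2.0, "SIDS": DAPHNIA_LC50_48H},
    {"CAS": "71-43-2", "SMILES": "c1ccccc1", "Data": 3.0, "SIDS": SECOND_ENDPOINT},
]


def build_sheet(rows):
    """Return CSV text with one SIDS path on every row."""
    if len(COLUMNS) > MAX_XLS_COLUMNS:
        raise ValueError("too many columns for an older Excel workbook")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        if not row["SIDS"]:
            raise ValueError("every row needs a SIDS path")
        writer.writerow(row)
    return buffer.getvalue()


sheet_text = build_sheet(rows)
print(sheet_text)
```

          The resulting text can be opened in Excel and saved as a workbook before running the import; whether the Toolbox accepts CSV directly is not something this sketch assumes.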

          • Re: Question about importing local database
            584150 New User

            Thank you for your prompt reply.

             

            Your advice is very helpful.

            So I could confirm your two answers, about the limited number of columns and the usage of SIDS, by using the Toolbox.

            Now I realize that the SIDS path has a very important and effective role.

             

            I have two additional questions or comments.

            Q1: Do you control the rules for creating a new SIDS path?

                If each person creates a new SIDS path for a new subsidiary category without a standard rule, several SIDS paths will be generated locally for the same endpoint.

             

            Q2: Do you have a complete list of existing endpoint SIDS paths?

                To know whether an endpoint (including condition, duration, species, etc.) is new or existing, a list of the existing endpoint SIDS paths would be convenient. An endpoint list gathered ad hoc for one chemical, even if it is benzene, will not be complete.

            Basically, there should be only one SIDS path for one endpoint.

             

            Best regards

              • Re: Question about importing local database
                New User

                Hi,

                Yes, the SIDS path points to the exact location for the endpoint data, which means that any imported data sets will then be included in the searches for grouping etc., along with internal Toolbox data with the same endpoint. A good way to check this once your dataset is imported is to input one of the compounds in your set, go to the Endpoints screen and click on "Gather data"; if your dataset has been imported properly, you should get the data reported back.

                You can indeed format a new SIDS endpoint by using the import wizard and following the same naming scheme (see below).

                To obtain the exact SIDS path, expand the endpoint tree, right-click on the endpoint of interest and then select "Copy SIDS path", as in the attachment to my last message. You can then use this for your own Excel sheet. Condition, species, units etc. are all optional fields which can be set up with the import wizard. Indeed, one SIDS path is for one endpoint. However, you can define several SIDS paths in one dataset if your dataset has a mixture of endpoints, so there is no need to import the data separately.
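
                If several people prepare sheets by hand, it may help to check before importing that no endpoint appears under slightly different spellings of its path. The sketch below is only an illustration: the function names are hypothetical, and treating paths that differ only in case or stray spaces as "the same endpoint" is an assumption, not a statement about how the Toolbox compares paths.

```python
from collections import defaultdict


def normalize(path):
    """Lower-case each path segment and trim surrounding spaces."""
    return "/".join(segment.strip().lower() for segment in path.split("/"))


def find_variant_paths(paths):
    """Return {normalized path: [distinct spellings]} where more than one spelling exists."""
    groups = defaultdict(set)
    for path in paths:
        groups[normalize(path)].add(path)
    return {key: sorted(spellings) for key, spellings in groups.items() if len(spellings) > 1}


# First entry is the path quoted earlier in the thread; the second is a
# deliberately mistyped variant made up for this example.
sids_column = [
    "Ecotoxicological information/Aquatic Toxicity/Animalia/Crustaceans/Daphnia Magna/LC50/48h",
    "Ecotoxicological information/Aquatic Toxicity/Animalia/Crustaceans/Daphnia magna/LC50/48h ",
    "Ecotoxicological information/Aquatic Toxicity/Animalia/Crustaceans/Daphnia Magna/LC50/48h",
]

variants = find_variant_paths(sids_column)
for spellings in variants.values():
    print("Same endpoint written in several ways:", spellings)
```

                Pasting paths obtained with "Copy SIDS path" rather than typing them avoids the problem in the first place.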

              • Re: Question about importing local database
                584150 New User

                Thank you for your prompt reply.

                 

                I have another question.

                It's about the practical use of screening data.

                 

                Some endpoint data are expressed as a range.

                A typical example is data from screening studies.

                For example, data from a fish acute toxicity threshold screening study may be an LD50 of “10 ppm or more” or “10 ppm or less”,

                or 10 - 100 ppm, etc.

                That is, the LD50 is expressed as a range and not determined as a single value.

                Is practical use of such screening data possible in the OECD application toolbox?

                 

                Best regards

                  • Re: Question about importing local database
                    New User

                    I think that such data would be of limited use unless there was a very large dataset. When importing data into the Toolbox you need to set the SIDS path so that it is in the same place as other data of exactly the same type (species, duration, test type and dimensions). I doubt there is any data in exactly the format you describe, so yours would need to be in a new endpoint path. This means that it would only be of use when you wished to predict that endpoint type again, and then the only data available for read-across would be the same set you have entered, as there will be no other data of the same kind already in the Toolbox.

                    Also, for read-across you would need to transform the data to a purely numerical form (e.g. 0 for less than 10 ppm and 1 for more than 10 ppm). You'll note that some of the endpoints, such as mutagenicity, are binary (i.e. 0 = negative and 1 = positive). If you do this, you can include an explanatory column in the spreadsheet and point to it during the importing process, so that in the detailed results you can see what the dimensions mean.
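
                    A minimal Python sketch of that transformation, assuming the screening results are stored as text such as "10 ppm or more" or "10 ppm or less". The function name, the accepted text format and the wording of the explanatory column are assumptions for illustration; open ranges such as "10 - 100 ppm" are deliberately rejected here because it is a judgment call how to code them.

```python
import re

# Matches e.g. "10 ppm or more" / "10 ppm or less" (assumed text format).
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*ppm\s+or\s+(more|less)\s*$", re.IGNORECASE)


def encode_threshold_result(text, threshold_ppm=10.0):
    """Return (code, explanation): code is 1 for 'or more', 0 for 'or less'."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unsupported screening result: {text!r}")
    if float(match.group(1)) != threshold_ppm:
        raise ValueError(f"result does not refer to the {threshold_ppm:g} ppm threshold")
    code = 1 if match.group(2).lower() == "more" else 0
    # Text for the explanatory column, so the 0/1 values stay interpretable.
    explanation = f"0 = {threshold_ppm:g} ppm or less, 1 = {threshold_ppm:g} ppm or more"
    return code, explanation


for raw in ["10 ppm or more", "10 ppm or less"]:
    code, explanation = encode_threshold_result(raw)
    print(raw, "->", code, "|", explanation)
```

                    The code goes in the Data column and the explanation in the extra column that you point to during the import.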

                     

                    Nick